perm filename VISION[AI,BGB] blob sn#131888 filedate 1974-11-26 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00011 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	{λ30I40,0P1⊂CFAFBFCαBAUMGART / QUAMJAFA}
C00003 00003	⊂0.	INTRODUCTION.⊃
C00007 00004	{JC} ________________________________________
C00010 00005	⊂1.	COMPUTER VISION THEORY.⊃
C00014 00006	⊂2.	APPLIED VISION/ROBOTICS SOFTWARE.⊃
C00016 00007	⊂3.	SYSTEMS SOFTWARE.⊃
C00019 00008	⊂4.	HARDWARE.⊃
C00021 00009	~ROBOTICS RELATED HARDWARE...~
C00025 00010	⊂5.	PERSONNEL AND ADMINISTRATION.⊃
C00028 00011	⊂6.	ARPA PROPOSAL FRAGMENTS.⊃
C00030 ENDMK
C⊗;
{λ30;I40,0;P1;⊂C;FA;FB;FC;αBAUMGART / QUAM;JAFA}
STANFORD ARTIFICIAL INTELLIGENCE LABORATORY {JR} NOVEMBER 1974


{JC;FB} BAUMGART / QUAM
{JC;FD} VISION RESEARCH POSITION PAPER
{JC;FD} (subsuming DRAFT material for the ARPA PROPOSAL 1975/76)

{W250;λ10;JAFB}
0.	INTRODUCTION.
1.	COMPUTER VISION THEORY.
2.	APPLIED VISION/ROBOTICS SOFTWARE.
3.	SYSTEMS SOFTWARE.
4.	HARDWARE.
5.	PERSONNEL AND ADMINISTRATION.
6.	ARPA PROPOSAL FRAGMENTS.

{W0;λ20;JUFA}
⊂0.	INTRODUCTION.⊃
{JUFA}
	The recent vision work  of Lynn Quam may be  characterized as
low  level  image analysis  performed  on sequences  of  high quality
raster images.  Quam's analysis has involved  performing correlation,
calibration and normalization  of images so that  measurements of the
world  can be  accurately  made.   From accurate  visual measurements
world models such  as 3-D  contour maps or  regional depth maps  have
been automatically computed.  

	The recent vision work of Bruce Baumgart may be characterized
as polygonal blob analysis of  low quality images. From sequences  of
image blobs  (2-D regions  delimited by  polygons), space blobs  (3-D
regions   delimited  by   polyhedra)  have  been   constructed  which
approximate the objects  being viewed.   Baumgart's overall  approach
may be characterized as inverse computer graphics - television images
are  used  to construct  models  which contain  sufficient geometric,
topological and photometric data such that arbitrary new views can be
computed. 

	Common to both approaches are three fundamental elements which
characterize the Baumgart-Quam position: parallax experiments, metric
analysis, and  description/verification into accurate  3-D mechanical
world models rather than recognition into semantic world models. 

	The approaches to date have differed in the quality of the
initial images used. The correlation techniques require reasonably
accurate video images, which Quam has obtained by having images
digitized outside of Stanford, whereas the blob regional technique
was adopted as an attempt to deal with the existing Stanford system
of second rate video hardware. The approaches have also differed
significantly in the nature of their 3-D models. Neither of these
differences, however, indicates any disagreement in the fundamental
science of computer vision, but rather a difference of approach to
the low quality Stanford visual hardware.
{JC} ________________________________________
	This document is  intended to be an internal  SAIL memorandum
prepared in  answer to a specific request from JMC  to BGB and PDQ to
combine written research  plans suitable for  the next ARPA  proposal
with our written list of grievances concerning present project
management, research, hardware, software, and working conditions.
	For the purposes of planning and complaining we wish to break
the problem space into three time periods and six subject levels as
listed in Box 1 below.
{|}
BOX 1{JCFAλ6;} SIX PLANNING LEVELS AND THREE TIME PERIODS.
{JAFA;T200,400;}
LEVEL #1	THEORY	Scientific Vision Research Theory.
LEVEL #2	PROGRAMMING	Applied Vision Research Programming.
LEVEL #3	SYSTEMS	Vision Related Systems Software.
LEVEL #4	HARDWARE	Vision Related Systems Hardware.
LEVEL #5	PERSONNEL	All Vision/Robotics Related People.
LEVEL #6	ENVIRONMENT	Light, Temperature, Noise, Tables and Chairs.

PERIOD #1	PRESENT	From now until when the PDP-11 and Zonker are installed.
PERIOD #2	SOON	From the end of period #1 until when the KL-10 is installed.
PERIOD #3	LATER	From the end of period #2 until
		 the end of the next two year ARPA contract, 1977.
{T-1;λ20;|;JUFA}
	The final section of this memorandum will be composed of
earlier paragraphs expurgated for use as a section in the proposal
to ARPA, 1975/76, for vision/robotics research.

⊂1.	COMPUTER VISION THEORY.⊃

	Perception is essential to intelligence as  it is the process
which converts external sensations into internal thoughts.  There are
two  kinds  of  simple  perception  systems:  stimulus-response   and
prediction-correction  feedback.  The  usual  (or  first  generation)
computer vision paradigm is stimulus-response by 2-D feature
extraction and statistical pattern recognition. Although much such
work remains to be done, we wish to pursue faithful descriptive
vision for  the  sake  of developing  a  prediction-correction  based
perception system.
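
The contrast between the two kinds of system can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not a method taken from this memorandum: the function names, the scalar "world state" and the fixed `gain` are all hypothetical. A stimulus-response system maps each sensation directly to an output, while a prediction-correction system keeps an internal estimate, predicts the next sensation from it, and corrects the estimate by a fraction of the prediction error.

```python
def stimulus_response(observations, threshold):
    """Stateless mapping: each sensation is classified on its own (hypothetical example)."""
    return [observed > threshold for observed in observations]


def prediction_correction(observations, initial_estimate, gain=0.5):
    """Keep an internal estimate and correct it from each prediction error.

    `gain` is an assumed constant between 0 and 1; it is not from the memo.
    """
    estimate = initial_estimate
    history = []
    for observed in observations:
        predicted = estimate              # what the internal model expects to see
        error = observed - predicted      # compare perceived with predicted
        estimate = predicted + gain * error
        history.append(estimate)
    return history


if __name__ == "__main__":
    print(stimulus_response([2.0, 12.0], threshold=5.0))
    print(prediction_correction([10.0, 10.0, 10.0, 10.0], initial_estimate=0.0))
```

The point of the sketch is only that the second system carries a model forward in time, so each new image is used to verify and refine a description rather than being interpreted from scratch.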

{λ7;JA} ~BOX 2. ELEMENTS OF BAUMGART/QUAM VISION THEORY~
	1. PARALLAX - study sequences of images from a moving camera.
	2. CORRELATION - between successive images.
	3. MEASUREMENT - obtain accurate geometric and photometric data.
	4. DESCRIPTION - analysis of images into numerical 3-D models.
	5. VERIFICATION - compare perceived with predicted.
	6. RECOGNITION  - at 3-D as well as the 2-D level.
{λ20;JUFA}

	The approach can be described in terms of the elements listed
in  Box 2  above.   ~Parallax~  is the  most  important, unambiguous,
unprejudiced depth clue - by taking sequences of images from slightly
different camera positions  the continuity of the world  in space and
time is not lost and the third dimension is gained. Cyclopean sessile
vision is a severe  handicap to perception of 3-D  space; furthermore
it  is  a  handicap that  can  be  removed  with current  technology.
~Correlation~  techniques   establish  the   correspondence   between
successive images in time,  or between a binocular pair  of images in
space,  or   between  perceived  and  predicted  images  in  feedback
(comparing real time images with mental images). Further  research in
correlation is required to... 
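
As a rough illustration of how correlation and parallax combine, the sketch below matches a window from one scan line against shifted windows in a second scan line by normalized cross-correlation, then converts the best shift into a depth with the standard pinhole-camera parallax relation (depth = focal length x baseline / disparity). This is a minimal sketch under assumptions, not Quam's correlator: the function names, the 1-D scan-line simplification, and the parameters `half_width`, `max_shift`, `focal_length_px` and `baseline` are hypothetical.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-length sample windows."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # featureless window: treat as no match
    return sum(x * y for x, y in zip(da, db)) / denom


def best_disparity(left, right, center, half_width, max_shift):
    """Return (shift, score) of the window in `right` that best matches `left`.

    Assumes the second camera moved so that features shift toward lower indices.
    """
    window = left[center - half_width:center + half_width + 1]
    best_shift, best_score = 0, -2.0
    for shift in range(max_shift + 1):
        start = center - half_width - shift
        if start < 0:
            break  # candidate window would fall off the scan line
        candidate = right[start:start + 2 * half_width + 1]
        score = ncc(window, candidate)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score


def depth_from_parallax(focal_length_px, baseline, disparity_px):
    """Pinhole-camera relation: depth = f * B / d (units follow `baseline`)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive to yield a finite depth")
    return focal_length_px * baseline / disparity_px


if __name__ == "__main__":
    left = [0, 0, 0, 0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0]
    right = [0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0, 0]
    shift, score = best_disparity(left, right, center=8, half_width=2, max_shift=5)
    print(shift, depth_from_parallax(500.0, 0.1, shift))
```

Normalizing each window by its own mean and variance is one simple way to make the match insensitive to brightness and contrast differences between the two images, which is why photometric normalization appears alongside correlation in the text.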

{λ7;JA} ~VISION RELATED MATHEMATICAL PROBLEMS~
	1. Representation of Space and Objects.
	2. Representation of Curved Objects.
	3. Manifold Resurfacing
{λ20;JUFA}
⊂2.	APPLIED VISION/ROBOTICS SOFTWARE.⊃
{JAFA;λ10;}
IMAGE PROCESSING SOFTWARE
	Stereo pair bulk correlation.
	Camera Calibration.
	Photometric and Geometric Normalization.
	Conversion between image representations: video → 2-D mosaic.
	Synthesis of 3-D models from sequences of 2-D images.

SPATIAL MODELING SOFTWARE
	Space Filling Lattice Modeling.
	Collision Avoidance.
	Fast Proximity Detection/Intersection Methods.
	3-D Maze Solving for navigation/manipulation planning.
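
One way to picture how a space filling lattice model could support 3-D maze solving is a breadth-first search over the free cells of an occupancy lattice. The sketch below is only an illustration under assumptions: the cell coordinates, the `occupied` set, the six-neighbor connectivity and the function name are hypothetical and are not drawn from any existing SAIL program.

```python
from collections import deque


def solve_lattice_maze(occupied, size, start, goal):
    """Shortest path of free cells from start to goal, or None if blocked.

    occupied: set of (x, y, z) cells filled by obstacles (assumed representation)
    size:     (nx, ny, nz) extent of the lattice
    """
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    if start in occupied or goal in occupied:
        return None
    came_from = {start: None}
    frontier = deque([start])
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:       # walk the back-pointers to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        for dx, dy, dz in steps:
            nxt = (cell[0] + dx, cell[1] + dy, cell[2] + dz)
            inside = all(0 <= c < s for c, s in zip(nxt, size))
            if inside and nxt not in occupied and nxt not in came_from:
                came_from[nxt] = cell
                frontier.append(nxt)
    return None


if __name__ == "__main__":
    wall = {(1, 0, 0), (1, 1, 0), (1, 2, 0)}  # a wall on the floor layer only
    print(solve_lattice_maze(wall, (3, 3, 2), (0, 0, 0), (2, 0, 0)))
```

In the usage example the only route goes up and over the wall, which is the kind of detour a 2-D planner would miss.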
	
PHYSICAL SIMULATION SOFTWARE
	Mechanical simulation,
	Creature simulation,
	Semantic simulation,
	Task Simulation for collision and manipulation.
{JUFA;λ20;}
⊂3.	SYSTEMS SOFTWARE.⊃

	We (PDQ and BGB) are concerned that the Rubin/Gorin new
PDP-10 Time Sharing System planning has not included specific concern
for facilitating the successor to SAIL and LISP. Specifically, memory
management, process structure, symbol tables, primary/secondary
storage communication, code-relocation, code-linking, management of
library routines and system support for yesterday's computer graphics
(much less tomorrow's) should be included.

~High Level Language successor to SAIL and LISP.~
   STEAM - Stanford Translator, Editor, Algol and Memory.
   QUAIL - Quam's Universal A.I. Language.

	The next generation  of ordinary general  purpose programming
language  should  be  a  well  integrated  triple  of  high  language
(ALGOL-like), low  language (machine  environment) and  user  control
language  (E-like  text  editor)  that   serves  the  purpose  of  an
incremental  compiler.  The language  must provide a Record  Structure
(nodes and  links) that  automatically can access  very large  memory
(disks) as well as public  data.  The language must provide means for
incrementally creating a very large system of code. 
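
To illustrate what a record structure that automatically reaches secondary storage might look like to the programmer, here is a small sketch. It is an assumption-laden illustration, not a design for STEAM or QUAIL: the class name, the least-recently-used eviction policy and the in-memory dictionary standing in for the disk are all hypothetical. Records are nodes, links are record identifiers, and following a link to a record that has been moved out of core brings it back without the caller noticing.

```python
from collections import OrderedDict


class PagedRecordStore:
    """Records (nodes) with links, kept in a small core cache over a 'disk'."""

    def __init__(self, capacity):
        self.capacity = capacity       # number of records allowed in core
        self.in_core = OrderedDict()   # record id -> record, oldest first
        self.on_disk = {}              # stand-in for secondary storage
        self.faults = 0                # count of records brought back from disk
        self.next_id = 0

    def new(self, **fields):
        """Create a record and return its identifier (usable as a link)."""
        rid = self.next_id
        self.next_id += 1
        self._install(rid, dict(fields))
        return rid

    def get(self, rid):
        """Follow a link; fetch the record from disk if it is not in core."""
        if rid in self.in_core:
            self.in_core.move_to_end(rid)
        else:
            self.faults += 1
            self._install(rid, self.on_disk.pop(rid))
        return self.in_core[rid]

    def _install(self, rid, record):
        self.in_core[rid] = record
        while len(self.in_core) > self.capacity:
            old_id, old_record = self.in_core.popitem(last=False)
            self.on_disk[old_id] = old_record   # write out the oldest record


if __name__ == "__main__":
    store = PagedRecordStore(capacity=2)
    link = None
    for value in ["d", "c", "b", "a"]:          # build a four-node linked list
        link = store.new(value=value, link=link)
    while link is not None:                     # traverse it through the links
        node = store.get(link)
        print(node["value"])
        link = node["link"]
```

The traversal code never mentions the disk, which is the property the paragraph above asks of the language.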

\~Memory Management~ - Data Structure Paging, Information Retrieval,
Subroutine Library Management.

\~System support for Computer Graphics~ - for example a CUSP file format
for graphics/text/video that is tolerated by the text editors and
programming language handling programs is an essential minimum that Gorin
explicitly wishes to ignore.

⊂4.	HARDWARE.⊃

	Although we could make good use of all of the following items,
we expect to receive support for about half to two thirds of them.
{JA;λ5;}
~DISPLAY HARDWARE~
1. High Quality Vector Display Device - $20,000 to $100,000
2. Color Raster Display Device for Zonker Memory  -  $2,000
	Two Sony Trinitrons off the shelf.
3. High Quality Hardcopy Graphics Device - $50,000
	
~VISION RELATED HARDWARE~
1. Finish PDP-11/SPS-41 vision/robotics system - $5,000
2. Two Low Cost Television Cameras for stereo vision - $5,000
	Two cameras and a Scheinman Pan/Tilt shoulder support.
3. One Low Cost Television Camera for hand held camera experiment - $2,000
4. High Quality Film Scanning Device - $50,000
5. One Color Television Camera - $15,000
{JUFA}
~ROBOTICS RELATED HARDWARE...~

1. Robot Repair Fund - $3,000
	Money to  be spent solely  on robot  repairs as necessary  as
determined by the  researchers.  This money is not to be spent by the
laboratory administrators, and left over money (should no  repairs be
necessary) shall be returned to the government. 

2. New Cart - $10,000

3. Turntable for Hand/Eye Table  -  $500

	A computer controlled turntable is a simple, effective way to
rotate an object. The turntable should be operated by the computer
rather than manually in order to free research personnel for other
activities.

4. Vision/Robotics Control Room - $10,000

	Money to be spent on providing a system of lighting, heating,
air-conditioning, noise-abatement, isolation, communication,
security, safety, table-space, storage-space and chairs for
controlling vision and robotics experiments. Current facilities are
subject to wide variations in temperature (40-100 degrees F.), high
noise levels, unintended sabotage, minor accidents and generally
uncomfortable research conditions.

5. One Robot (with arms and legs and two cameras) - $1,500,000

	We wish to seriously propose that, for the National
Bicentennial, ARPA/IPT demonstrate on 4 July 1976 an actually working
android that can walk, run, climb, pick things up, handle tools, and
see. The demonstration robot would require both an "on-board"
computer and a remote computer, which would in turn require a
high-quality wide-bandwidth communications link. The robot would be
animated by electric motors similar to those already used in
mechanical manipulators. Although a true robot's power supply is a
considerable problem, we would use some short lived supply such as
batteries, fuel cells, or a small gasoline engine powering a
generator.

⊂5.	PERSONNEL AND ADMINISTRATION.⊃

	This section is a SAIL internal document consisting of
intramural protesting, criticism, confrontation and polemic.

~Our own sins:~

\BGB - Has written too much flaky undocumented software which is largely
incompatible with SAIL. Complains too much and attempts to fix hardware
on his own, substituting enthusiasm for technical skill.

PDQ -

~Gripes about other people:~

\JMC - Has been working too much on non A.I. Project things:
News Service, Energy Conservation, etc.

\LES - Has misallocated resources to non A.I. Project things:
Music, Art, Vending Machine, Voder, Audio Switch, Dial Out.
Has denied support to maintain Cart, Turntable
and television cameras.

TOB - 
	
~Alternatives:~

\1. ~Preserve status quo:~ Binford in charge of Hand/Eye; Quam in charge
of  Perception; Baumgart  formally in  perception but  actually doing
hand/eye work. 

\2. ~Fusion:~ the  Hand/Eye  and the  Perception  projects should  be
combined with Binford in charge of hand/eye theory (table top scenes)
and philosophy of vision; Baumgart in charge of hardware liaison (thru
Gafford)  and lower  level vision/graphics  programming; and  Quam in
charge  of perception  theory (natural  scenes) and  image processing
research programming. 

\3. ~Fission:~ Each RA runs a separate research program: Binford, hand/eye;
Quam, perception; Baumgart, vision/graphics.

⊂6.	ARPA PROPOSAL FRAGMENTS.⊃

~ONE GOLD BULLET:~

	If funded and if the project administrators redress current
grievances (December 1974), PDQ (Quam) and BGB (Baumgart) will
undertake research towards making a computer system that can
automatically acquire three dimensional models from sequences of
images taken in natural world as well as table top environments.

~SEVEN SILVER BULLETS:~

	1. Descriptive Vision Example...
	2. Verification Vision...
	3. Geometric Modeling...
	4. Image Processing...
	5. Robot Explorer...
	6. (Some Arm thing that doesn't resemble anything in the NSF contract).
		6a. Perception using a hand held camera.
		6b. Roto Tool Sculpture of a model.
	7. Recognition Vision...