{λ30;I40,0;P1;⊂C;FA;FB;FC;αBAUMGART / QUAM;JAFA}
STANFORD ARTIFICIAL INTELLIGENCE LABORATORY {JR} NOVEMBER 1974
{JC;FB} BAUMGART / QUAM
{JC;FD} VISION RESEARCH POSITION PAPER
{JC;FD} (subsuming DRAFT material for the ARPA PROPOSAL 1975/76)
{W250;λ10;JAFB}
0. INTRODUCTION.
1. COMPUTER VISION THEORY.
2. APPLIED VISION/ROBOTICS SOFTWARE.
3. SYSTEMS SOFTWARE.
4. HARDWARE.
5. PERSONNEL AND ADMINISTRATION.
6. ARPA PROPOSAL FRAGMENTS.
{W0;λ20;JUFA}
⊂0. INTRODUCTION.⊃
{JUFA}
The recent vision work of Lynn Quam may be characterized as
low level image analysis performed on sequences of high quality
raster images. Quam's analysis has involved performing correlation,
calibration and normalization of images so that measurements of the
world can be accurately made. From accurate visual measurements
world models such as 3-D contour maps or regional depth maps have
been automatically computed.
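
For concreteness, the arithmetic that turns correlated image measurements into depth is essentially the parallax relation z = f * b / d, where f is the focal length, b the camera baseline, and d the measured image disparity. A minimal sketch follows (illustrative modern Python; the function name and the numbers in the usage line are ours, not Quam's actual code):

def depth_from_parallax(focal_length, baseline, disparity):
    """Depth of a world point from the image shift (disparity) between two
    views: the standard parallax relation z = f * b / d.  The units of
    `baseline` carry through to the returned depth."""
    if disparity == 0:
        return float("inf")   # no measurable shift: point is effectively at infinity
    return focal_length * baseline / disparity

# Example: 50 mm lens, cameras 200 mm apart, 2 mm image shift -> depth 5000 mm.
z = depth_from_parallax(focal_length=50.0, baseline=200.0, disparity=2.0)
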
The recent vision work of Bruce Baumgart may be characterized
as polygonal blob analysis of low quality images. From sequences of
image blobs (2-D regions delimited by polygons), space blobs (3-D
regions delimited by polyhedra) have been constructed which
approximate the objects being viewed. Baumgart's overall approach
may be characterized as inverse computer graphics - television images
are used to construct models which contain sufficient geometric,
topological and photometric data such that arbitrary new views can be
computed.
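
The heart of that inverse graphics loop is being able to compute a predicted view from the model. A toy sketch of the projection step follows (illustrative Python with numpy; the dictionary layout of the "space blob" and the fixed-axis pinhole camera are simplifications of ours, not the actual model representation):

import numpy as np

# A toy "space blob": vertices in world coordinates plus faces given as
# lists of vertex indices.  This record layout is only an illustration.
cube = {
    "vertices": np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)],
                         dtype=float),
    "faces": [[0, 1, 3, 2], [4, 5, 7, 6], [0, 1, 5, 4],
              [2, 3, 7, 6], [0, 2, 6, 4], [1, 3, 7, 5]],
}

def project(vertices, camera_position, focal_length=1.0):
    """Perspective-project world points into image (u, v) coordinates.

    Assumes a pinhole camera at `camera_position` looking straight down the
    +z axis with no rotation -- just enough to show how a predicted view can
    be computed from a 3-D model and then compared with a television image.
    """
    rel = vertices - camera_position
    u = focal_length * rel[:, 0] / rel[:, 2]
    v = focal_length * rel[:, 1] / rel[:, 2]
    return np.stack([u, v], axis=1)

# Predicted image positions of the model's vertices from one camera station.
predicted = project(cube["vertices"], camera_position=np.array([0.5, 0.5, -4.0]))
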
Common to both approaches are three fundamental elements which
characterize the Baumgart-Quam position: parallax experiments, metric
analysis, and description/verification into accurate 3-D mechanical
world models rather than recognition into semantic world models.
The approaches to date have differed in the quality of the
initial images used. The correlation techniques require reasonably
accurate video images, which Quam has obtained by having images
digitized outside of Stanford; whereas the blob/region techniques
were adopted as an attempt to deal with the existing Stanford system
of second-rate video hardware. The approaches have also differed
significantly in the nature of their 3-D models; however, neither of
these differences indicates any disagreement in the fundamental
science of computer vision, but rather a difference of approach to
the low quality Stanford visual hardware.
{JC} ________________________________________
This document is intended to be an internal SAIL memorandum
prepared in answer to a specific request from JMC to BGB and PDQ to
combine written research plans suitable for the next ARPA proposal
with our written list of grievances concerning present project
management, research, hardware, software, and working conditions.
For the purposes of planning and complaining we wish to break
the problem space into three time periods and six subject levels as
listed in Box 1 below.
{|}
BOX 1{JCFAλ6;} SIX PLANNING LEVELS AND THREE TIME PERIODS.
{JAFA;T200,400;}
LEVEL #1 THEORY Scientific Vision Research Theory.
LEVEL #2 PROGRAMMING Applied Vision Research Programming.
LEVEL #3 SYSTEMS Vision Related Systems Software.
LEVEL #4 HARDWARE Vision Related Systems Hardware.
LEVEL #5 PERSONNEL All Vision/Robotics Related People.
LEVEL #6 ENVIRONMENT Light, Temperature, Noise, Tables and Chairs.
PERIOD #1 PRESENT From now until when the PDP-11 and Zonker are installed.
PERIOD #2 SOON From the end of period #1 until when the KL-10 is installed.
PERIOD #3 LATER From the end of period #2 until
the end of the next two year ARPA contract, 1977.
{T-1;λ20;|;JUFA}
The final section of this memorandum will be composed of
earlier paragraphs expurgated for use as a section in the proposal
to ARPA, 1975/76, for vision/robotics research.
⊂1. COMPUTER VISION THEORY.⊃
Perception is essential to intelligence as it is the process
which converts external sensations into internal thoughts. There are
two kinds of simple perception systems: stimulus-response and
prediction-correction feedback. The usual (or first generation)
computer vision paradigm is stimulus response by 2-D feature
extraction and statistical pattern recognition. Although much such
work remains to be done, we wish to pursue faithful descriptive
vision for the sake of developing a prediction-correction based
perception system.
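
As a caricature of what prediction-correction perception means, the toy loop below refines a single depth estimate by repeatedly predicting what the camera should see, comparing that with what it does see, and correcting the model (illustrative Python; the scenario, names, and numbers are ours and only stand in for the proposed system):

def predict_u(point_x, depth, camera_x, focal_length=1.0):
    """Predict the image coordinate of a point from the current world model."""
    return focal_length * (point_x - camera_x) / depth

def prediction_correction_depth(observations, point_x, depth_guess, gain=0.5):
    """Toy prediction-correction loop refining one depth estimate.

    `observations` is a sequence of (camera_x, observed_u) pairs from a
    camera moving along the x axis.  Each step predicts what should be
    seen, compares it with what is seen, and corrects the model.
    """
    depth = depth_guess
    for camera_x, observed_u in observations:
        predicted_u = predict_u(point_x, depth, camera_x)
        error = observed_u - predicted_u                 # perceived minus predicted
        slope = -(point_x - camera_x) / depth ** 2       # d(predicted_u)/d(depth)
        if slope != 0.0:
            depth += gain * error / slope                # Newton-style correction
    return depth

# Usage: a point at x = 2.0 and true depth 5.0 seen from cameras at x = 0, 1, 3;
# starting from a wrong guess of 3.0 the estimate moves toward 5.0.
true_depth, point_x = 5.0, 2.0
observations = [(cx, (point_x - cx) / true_depth) for cx in (0.0, 1.0, 3.0)]
estimate = prediction_correction_depth(observations, point_x, depth_guess=3.0)
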
{λ7;JA} BOX 2. ~ELEMENTS OF BAUMGART/QUAM VISION THEORY~
1. PARALLAX - study sequence of images from a moving camera.
2. CORRELATION - between successive images.
3. MEASUREMENT - obtain accurate geometric and photometric data.
4. DESCRIPTION - analysis of images into numerical 3-D models.
5. VERIFICATION - compare perceived with predicted.
6. RECOGNITION - at 3-D as well as the 2-D level.
{λ20;JUFA}
The approach can be described in terms of the elements listed
in Box 2 above. ~Parallax~ is the most important, unambiguous,
unprejudiced depth clue - by taking sequences of images from slightly
different camera positions the continuity of the world in space and
time is not lost and the third dimension is gained. Cyclopean sessile
vision is a severe handicap to perception of 3-D space; furthermore
it is a handicap that can be removed with current technology.
~Correlation~ techniques establish the correspondence between
successive images in time, or between a binocular pair of images in
space, or between perceived and predicted images in feedback
(comparing real time images with mental images). Further research in
correlation is required to...
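
A minimal sketch of the correlation step itself, matching a patch of one frame against a search window in the next frame by normalized cross-correlation (illustrative Python with numpy; the function names, patch size, and search radius are illustrative choices of ours, not existing code):

import numpy as np

def normalized_cross_correlation(patch, window):
    """Score (-1..1) for how well `patch` matches an equal-sized `window`."""
    p = patch - patch.mean()
    w = window - window.mean()
    denom = np.sqrt((p * p).sum() * (w * w).sum())
    return 0.0 if denom == 0 else float((p * w).sum() / denom)

def find_match(image_a, image_b, row, col, half=4, search=16):
    """Find where the patch of image_a centered at (row, col) moved to in image_b.

    Exhaustive search over a square search area; returns the best (row, col)
    in image_b and its correlation score.  Assumes the patch lies well inside
    image_a.
    """
    patch = image_a[row - half:row + half, col - half:col + half]
    best_score, best_rc = -2.0, (row, col)
    for r in range(row - search, row + search + 1):
        for c in range(col - search, col + search + 1):
            if r - half < 0 or c - half < 0:
                continue                          # stay inside the image
            window = image_b[r - half:r + half, c - half:c + half]
            if window.shape != patch.shape:
                continue
            score = normalized_cross_correlation(patch, window)
            if score > best_score:
                best_score, best_rc = score, (r, c)
    return best_rc, best_score
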
{λ7;JA} ~VISION RELATED MATHEMATICAL PROBLEMS~
1. Representation of Space and Objects.
2. Representation of Curved Objects.
3. Manifold Resurfacing.
{λ20;JUFA}
⊂2. APPLIED VISION/ROBOTICS SOFTWARE.⊃
{JAFA;λ10;}
IMAGE PROCESSING SOFTWARE
Stereo pair bulk correlation.
Camera Calibration.
Photometric and Geometric Normalization.
Conversion between image representations: video → 2-D mosaic.
Synthesis of 3-D models from sequences of 2-D images.
SPATIAL MODELING SOFTWARE
Space Filling Lattice Modeling.
Collision Avoidance.
Fast Proximity Detection/Intersection Methods.
3-D Maze solving for navigation/manipulation planning (see the sketch after this list).
PHYSICAL SIMULATION SOFTWARE
Mechanical simulation,
Creature simulation,
Semantic simulation,
Task Simulation for collision and manipulation.
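
The 3-D maze solving item above can be sketched directly (illustrative Python; the lattice representation and all names are ours): breadth-first search over a space filling lattice of free and occupied cells yields a shortest collision-free path of cells for navigation or manipulation planning.

from collections import deque

def plan_path(occupied, start, goal, size):
    """Breadth-first search over a finite 3-D lattice of cells.

    `occupied` is a set of (x, y, z) cells containing obstacles, `size` is
    the (nx, ny, nz) extent of the lattice, and `start`/`goal` are free
    cells.  Returns a list of cells from start to goal, or None.
    """
    def free(cell):
        return all(0 <= c < n for c, n in zip(cell, size)) and cell not in occupied

    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    frontier = deque([start])
    came_from = {start: None}
    while frontier:
        cell = frontier.popleft()
        if cell == goal:                       # rebuild the path by walking back
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        for dx, dy, dz in steps:
            nxt = (cell[0] + dx, cell[1] + dy, cell[2] + dz)
            if free(nxt) and nxt not in came_from:
                came_from[nxt] = cell
                frontier.append(nxt)
    return None

# Usage: route around a wall of occupied cells in a 4x4x4 lattice (gap at z == 3).
wall = {(2, y, z) for y in range(4) for z in range(3)}
route = plan_path(wall, start=(0, 0, 0), goal=(3, 0, 0), size=(4, 4, 4))
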
{JUFA;λ20;}
⊂3. SYSTEMS SOFTWARE.⊃
We (PDQ and BGB) are concerned that the Rubin/Gorin new
PDP-10 Time Sharing System planning has not included specific concern
for facilitating the successor to SAIL and LISP. Specifically, memory
management, process structure, symbol tables, primary/secondary
storage communication, code-relocation, code-linking, management of
library routines, and system support for yesterday's computer graphics
(much less tomorrow's) should be included.
~High Level Language successor to SAIL and LISP.~
STEAM - Stanford Translator, Editor, Algol and Memory.
QUAIL - Quam's Universal A.I. Language.
The next generation of ordinary general purpose programming
language should be a well integrated triple of high language
(ALGOL-like), low language (machine environment) and user control
language (E-like text editor) that serves the purpose of an
incremental compiler. The language must provide a Record Structure
(nodes and links) that can automatically access very large memory
(disks) as well as public data. The language must provide means for
incrementally creating a very large system of code.
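
As a toy of the kind of record structure (nodes and links) meant here (illustrative Python; nothing below corresponds to an existing SAIL or LISP facility, and the disk-resident storage requirement is only noted in a comment, not implemented):

from dataclasses import dataclass, field

@dataclass
class Node:
    """A record with named fields plus links to other records.  In the
    proposed language such records would spill transparently to disk;
    this sketch only shows the node-and-link shape."""
    name: str
    data: dict = field(default_factory=dict)
    links: list = field(default_factory=list)   # references to other Nodes

def link(a, b):
    """Connect two records in both directions."""
    a.links.append(b)
    b.links.append(a)

# A tiny node-and-link world model: a camera record linked to an object record.
camera = Node("camera-1", {"focal_length": 50.0})
block = Node("block-1", {"faces": 6})
link(camera, block)
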
\~Memory Management~ - Data Structure Paging, Information Retrieval,
Subroutine Library Management.
\~System support for Computer Graphics~ - for example a CUSP file format
for graphics/text/video that is tolerated by the text editors and
programming language handling programs is an essential minimum that Gorin
explicitly wishes to ignore.
⊂4. HARDWARE.⊃
Although we could make good use of all of the following items,
we expect to receive support for about half to two thirds of them.
{JA;λ5;}
~DISPLAY HARDWARE~
1. High Quality Vector Display Device - $20,000 to $100,000
2. Color Raster Display Device for Zonker Memory - $2,000
Two Sony Trinitrons off the shelf.
3. High Quality Hardcopy Graphics Device - $50,000
~VISION RELATED HARDWARE~
1. Finish PDP-11/SPS-41 vision/robotics system - $5,000
2. Two Low Cost Television Cameras for stereo vision - $5,000
Two cameras and a Scheinman Pan/Tilt shoulder support.
3. One Low Cost Television Camera for hand held camera experiment - $2,000
4. High Quality Film Scanning Device - $50,000
5. One Color Television Camera - $15,000
{JUFA}
~ROBOTICS RELATED HARDWARE...~
1. Robot Repair Fund - $3,000
Money to be spent solely on robot repairs as necessary, as
determined by the researchers. This money is not to be spent by the
laboratory administrators, and left over money (should no repairs be
necessary) shall be returned to the government.
2. New Cart - $10,000
3. Turntable for Hand/Eye Table - $500
A computer controlled turntable is a simple, effective way to
present all sides of an object to the camera. The turntable should be
operated by the computer rather than manually in order to free
research personnel for other activities.
4. Vision/Robotics Control Room - $10,000
Money to be spent on providing a system of lighting, heating,
air-conditioning, noise-abatement, isolation, communication,
security, safety, table-space, storage-space and chairs for
controlling vision and robotics experiments. Current facilities are
subject to wide variations in temperature (40-100 degrees F.), high
noise levels, unintended sabotage, minor accidents and generally
uncomfortable research conditions.
5. One Robot (with arms and legs and two cameras) - $1,500,000
We wish to propose seriously that, for the National
Bicentennial, ARPA/IPT demonstrate on 4 July 1976 an actually
working android that can walk, run, climb, pick things up, handle
tools, and see. The demonstration robot would require both an
"on-board" and a remote computer, which would in turn require a
high-quality, wide-bandwidth communications link. The robot would be
animated by electric motors similar to those already used in
mechanical manipulators. Although a true robot's power supply is a
considerable problem, we would use some short lived supply such as
batteries, fuel cells, or a small gasoline engine to power a
generator.
⊂5. PERSONNEL AND ADMINISTRATION.⊃
This section is a SAIL internal document consisting of
intramural protesting, criticism, confrontation and polemic.
~Our own sins:~
\BGB - Has written too much flaky undocumented software which is largely
incompatible with SAIL. Complains too much and attempts to fix hardware
on his own, substituting enthusiasm for technical skill.
PDQ -
~Gripes about other people:~
\JMC - Has been working too much on non A.I. Project things:
News Service, Energy Conservation, etc.
\LES - Has misallocated resources to non A.I. Project things:
Music, Art, Vending Machine, Voder, Audio Switch, Dial Out.
Has denied support to maintain Cart, Turntable
and television cameras.
TOB -
~Alternatives:~
\1. ~Preserve status quo:~ Binford in charge of Hand/Eye; Quam in charge
of Perception; Baumgart formally in perception but actually doing
hand/eye work.
\2. ~Fusion:~ the Hand/Eye and the Perception projects should be
combined with Binford in charge of hand/eye theory (table top scenes)
and philosophy of vision; Baumgart in charge of hardware liaison (through
Gafford) and lower level vision/graphics programming; and Quam in
charge of perception theory (natural scenes) and image processing
research programming.
\3. ~Fission:~ Each RA runs a separate research program: Binford, hand/eye;
Quam, perception; Baumgart, vision/graphics.
⊂6. ARPA PROPOSAL FRAGMENTS.⊃
~ONE GOLD BULLET:~
If funded, and if the project administrators redress current
grievances (December 1974), PDQ (Quam) and BGB (Baumgart) will
undertake research towards making a computer system that can
automatically acquire three dimensional models from sequences of
images taken in natural world as well as table top environments.
~SEVEN SILVER BULLETS:~
1. Descriptive Vision Example...
2. Verification Vision...
3. Geometric Modeling...
4. Image Processing...
5. Robot Explorer...
6. (Some Arm thing that doesn't resemble anything in the NSF contract)...
6a. Perception using a hand held camera.
6b. Roto Tool Sculpture of a model.
7. Recognition Vision...