PROGRESS REPORT (YEAR 3)
SUBMITTED TO THE NORTH CENTRAL SOYBEAN RESEARCH PROGRAM

1. Project title: A Public EST Project for Soybean

2. Names of principal investigators
     Dr. Randy C. Shoemaker, USDA-ARS, Iowa State University
     Dr. Paul S. Keim, Northern Arizona University
     Dr. Lila Vodkin, University of Illinois
     Dr. Ernest Retzel, University of Minnesota
     Dr. David Smoller, Incyte Genomics
     Dr. Robert Waterston, Washington University

3. Starting and ending project dates
     1 March 1998 - 28 February 2002

4. Period for this progress report
     1 March 2000 - 28 February 2001 (YEAR THREE)
     This report was prepared during the week of December 25, 2000.

     Progress Reports for previous years can be found here.

5. Description of progress made towards achieving the stated research objectives

Gene Library Construction and Gene De-Coding
The primary objective of Year 3 was to continue developing and characterizing enough cDNA libraries for the project to generate high-quality sequences at high throughput for the duration of the grant. A second objective of Year 3 was to make 100,000 sequence attempts and to deposit the 'good' sequences into a public database.

The three laboratories responsible for the development of the cDNA libraries have divided the responsibilities according to expertise. The Keim laboratory prepares cDNA libraries from root-related tissues or environmental conditions (SCN infection, nodulation, etc.). The Vodkin laboratory is responsible for development of libraries from developing seed, pods, and seed coats. Dr. Vodkin is also responsible for administering the National Science Foundation project on functional genomics that is a direct offshoot of this project. The Shoemaker laboratory is responsible for developing libraries from all other above-ground plant parts (leaves, stems, flowers, etc.). Dr. Shoemaker is also responsible for administering this project.

The following table summarizes the cDNA libraries already developed. More than 70 libraries have been created, but some are duplicates and are therefore shown only once in the table.

Library Source                                        Cultivar
Shoot tips                                            Williams
Shoot tips                                            Williams 82
Whole Stems (2-3 wk old)                              Williams
Whole Stems (2-3 wk old)                              Williams 82
Whole stem (2-3 wk old seedlings)                     Williams 82
Whole Plant (2-3 wk old)                              Williams
Leaf (2-3 wk old)                                     Williams 82
Leaf (2-3 wk old)                                     Williams
Leaf (senescing)                                      Williams
Vegetative buds (adult)                               Williams 82
Flower buds (adult)                                   Williams 82
Mature flowers                                        Williams 82
Immature flowers                                      Williams 82
Whole root (seedling)                                 Williams
Cotyledons (mid-maturity)                             Williams
Cotyledons (degenerating)                             Williams
Cotyledons (very young)                               Williams
Cotyledons (young)                                    Williams
Cotyledons (immature)                                 Williams
Cotyledons (8 day)                                    Williams
Cotyledons (3 and 7 day)                              Williams
Cotyledons (immature)                                 Williams
Cotyledons (young)                                    Williams
Seed coat (mid-maturity)                              Williams
Seed coat (immature)                                  T157
Seed coat (very young)                                Williams
Seed coats                                            T157 (a pigmented isoline of Richland)
Seed coats (immature)                                 Williams
Seed coats (very young)                               Williams
Whole Roots (8 day old, no nutrients)                 Williams
Mature root                                           Williams
Mature root                                           'Supernod'
Seedling root (with nutrients)                        Williams
Seedling root                                         Delsoy
Seedling root                                         Williams
Denodulated roots                                     Williams
Nodules                                               Williams
Roots (Glycine max "supernod" plants)                 Supernod NTS382
Whole roots (2 month old plants, mature,              Williams
   with nutrients, lacking nodules)
Heat shocked seedlings                                Williams
Hypocotyl                                             Williams
Hypocotyl (3 day old)                                 Williams
Hypocotyl, Plumule (germinating seeds)                Williams
Hypocotyl (9-10 day old)                              Williams
Hypocotyl (9-10 day old)                              Williams 82
Hypocotyl (3 day old seedlings)                       Williams 82
Hypocotyl (9-10 day etiolated seedlings)              Williams
Seeds (germinating)                                   Williams
Leaf (fully expanded) (2 wk old)                      Williams
Leaf (senescing, mature)                              Williams 82
Leaf (senescing, mature)                              Williams
Leaf (immature, unfurled trifoliate)                  Williams
Leaf and shoot tip (salt stressed, 2 wk old)          Williams
Leaf (2-3 wk old seedlings)                           Williams
Leaf (2-3 wk old seedlings)                           Williams 82
Pod (immature)                                        Williams
Whole young pods (12 wk)                              Williams
Seedling (2-3 wk old)                                 PI 567374
Seedling (2-3 wk old)                                 Williams 82
Whole seedling (minus cotyledons)                     Ogden
Whole seedling (minus cotyledons)                     Williams
Seedling (7 day old)                                  Delsoy 5710
Whole Seedling (3 wk old)                             Harosoy (Lf1Lf1lnlny9y9Pd1dt1dt1L1L1)
Whole seedlings (2-3 wk old)                          Williams
Vegetative buds (adult)                               Williams 82
Flowers (immature)                                    Williams 82
Flowers (mature)                                      Williams 82
Apical shoot tips (9-10 day old etiolated seedlings)  Williams
Apical shoot tips (9-10 day old etiolated seedlings)  Williams 82
Somatic Embryos                                       Jack
__________________________________________________

The libraries have been assayed for quality, and clones have been sent to Washington University. The libraries from developing seed (cotyledons) have been specially treated by Dr. Vodkin's group to eliminate highly redundant clones. This decreases the cost of generating novel data from that organ/tissue source.

Our collaborators at Incyte Genomics (formerly Genome Systems), led by Dr. David Smoller, have initiated the procedures for creating copies of all the libraries we have sent through them to Washington University. They provide one copy of each library to the sequencing center, return one copy to the laboratory that created the library, and store copies at their facilities so that the public can access the clones when needed.

During the course of this project, hundreds of soybean cDNAs were specifically requested by soybean researchers and obtained from Genome Systems through this project. Each clone obtained in this way represents a soybean gene that has been cloned without the usual costs and time loss associated with independent projects. When appropriate, entire libraries are being obtained from the individual labs preparing them.

The responsibility of Washington University is to generate sequences of the cDNAs, establish software for the automated processing of sequences, evaluate those sequences, remove contaminating sequences, annotate and deposit those sequences into the public database. Dr. Retzel at the University of Minnesota has the responsibility to provide computer tools to facilitate the analysis of the DNA sequences generated by this project.

As of 15 December 2000, a total of 136,958 soybean gene sequences had been deposited into the public database. (NOTE: As of 11 May 2001, a total of 168,157 ESTs have been deposited into dbEST.) Soybean is now the number one plant species in the NIH dbEST database; no other plant has this many cDNA sequences in the public sector. Thousands of additional clones are in the 'pipeline' of production or annotation. New libraries are continually being prepared, and we will begin assaying them as soon as they are received. Thanks to the ability of Washington University to carry over funds from year 2, we expect that the anticipated 125,000 attempts for year 3 will be completed by March 1, 2001. This puts us on schedule. We expect to make 100,000 attempts during the fourth and final year, which should bring the total number of soybean ESTs to well above 200,000.
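The 200,000 projection can be checked with simple arithmetic. The Python sketch below uses only the deposit counts and the year 4 attempt target quoted above; the success rate it computes is a derived threshold, not a figure reported by the project, and the function name is our own. It ignores any year 3 attempts still in the pipeline.

```python
def required_success_rate(deposited, target, attempts):
    """Fraction of attempts that must yield a deposited EST to reach target."""
    return max(0, target - deposited) / attempts


# Figures quoted in this report.
YEAR4_ATTEMPTS = 100_000
TARGET = 200_000

from_dec_2000 = required_success_rate(136_958, TARGET, YEAR4_ATTEMPTS)
from_may_2001 = required_success_rate(168_157, TARGET, YEAR4_ATTEMPTS)

print(f"Needed from the 15 December 2000 count: {from_dec_2000:.1%}")  # 63.0%
print(f"Needed from the 11 May 2001 count: {from_may_2001:.1%}")       # 31.8%
```

Starting from the May 2001 count, fewer than one in three year 4 attempts would need to produce a depositable sequence for the total to pass 200,000.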

During this year, the research team and postdoctoral researchers have met at Northern Arizona University, at the University of Illinois, and in San Diego, CA, to discuss progress, strategies, and priorities.

Bioinformatics and Computer Management of EST Data
Another objective of the project is to develop computerized methods to efficiently sort through and query the vast amounts of raw data (gene codes) that will be generated through this project.

Dr. Retzel, University of Minnesota, has begun to develop bioinformatics tools to help analyze the EST data. He is in the process of modifying existing computer programs so that they can be used to include information on the tissue and organ source of each cDNA library produced in this project. This will allow us to gather information about gene expression in different parts of the soybean and during different environmental conditions.
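As a rough illustration of the kind of query this would support, the Python sketch below tallies ESTs per gene by library source. The record layout, the EST identifiers, the gene labels, and the function name are all hypothetical and are not taken from Dr. Retzel's software; only the library source names come from the table in this report.

```python
from collections import Counter, defaultdict

# Hypothetical EST records: (est_id, library_source, putative_gene).
# IDs and gene labels are invented for illustration only.
ESTS = [
    ("est001", "Nodules", "gene_A"),
    ("est002", "Nodules", "gene_A"),
    ("est003", "Seedling root", "gene_A"),
    ("est004", "Leaf (senescing)", "gene_B"),
    ("est005", "Nodules", "gene_B"),
]


def counts_by_source(ests):
    """Return {gene: Counter({library_source: number_of_ESTs})}."""
    table = defaultdict(Counter)
    for _est_id, source, gene in ests:
        table[gene][source] += 1
    return table


if __name__ == "__main__":
    # Print one line per (gene, source) pair, most frequent source first.
    for gene, per_source in sorted(counts_by_source(ESTS).items()):
        for source, n in per_source.most_common():
            print(f"{gene}\t{source}\t{n}")
```

EST counts of this kind are only a rough proxy for expression level, particularly for libraries in which highly redundant clones were deliberately eliminated, such as the developing seed libraries described earlier.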

Dr. Retzel and Dr. Shoemaker, in collaboration with other researchers, are submitting a grant proposal to the National Science Foundation (FY 2001) in an attempt to leverage additional federal and commodity dollars into a general 'legume' bioinformatics project. In addition, the USDA-ARS is now providing financial support to Dr. Retzel to expand his efforts to develop methods for handling large amounts of DNA sequence.

Dr. Shoemaker has established additional collaborations with biocomputing faculty at Iowa State University to enhance the project's ability to evaluate and analyze data.

6. Listing of significant problems encountered and actions taken in response

The cost of robotic equipment unexpectedly increased for Washington University. This was accompanied by a decrease in the cost of personnel and an increase in the cost of materials and supplies. However, the NCSRP approved the movement of funds from one category to another with no change in the total amount of funds to be expended. This flexibility has allowed Washington University to continue on schedule.

7. Are objectives expected to be completed on time?

The project, "A Public EST Project for Soybean," is on schedule and is currently experiencing no problems. We expect to complete our long-term objectives (4-year project) on time.

8. Additional important comments

cDNA library development for the fourth and final year will emphasize pathogen-challenged tissue. During this final year, cDNA libraries will be created from soybean tissues inoculated with various disease pathogens (Phytophthora, Pseudomonas, Soybean Mosaic Virus, etc.).

At least three proposals are being submitted to the National Science Foundation this year that are directly dependent upon this EST project!

Soybean researchers throughout the U.S. are grateful for your support.
