UNIVERSITY OF MARYLAND
Large-scale genome sequence analysis has become an integral part of nearly all research areas in biology. Genome sequencing costs have dropped sharply, but there has not been a commensurate increase in the computational resources that support large-scale data processing and analysis. To address this gap, we propose to build the Data Intensive Academic Grid (DIAG), a resource that will include a computational infrastructure, a high-performance storage network, and optimized data sets generated by mining public data repositories such as the Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis (CAMERA) and the National Center for Biotechnology Information (NCBI).
The DIAG will leverage technologies developed for existing grid resources such as the NSF-supported TeraGrid and the Open Science Grid (OSG). We will also take advantage of virtual machine (VM) technology, which will overcome the limitations that genomics researchers have previously experienced when using large monolithic grid systems.
The DIAG computational infrastructure will include between 800 and 1,200 cores for high-throughput computational analysis and between 80 and 120 cores for high-performance computational analysis. To handle the large amounts of data generated by these analyses, we will deploy 500 terabytes of clustered high-performance storage. The bioinformatics community will access the DIAG through Ergatis (a web-based pipeline creation and management tool), through bioinformatics-oriented VMs, and through interactive and programmatic interfaces built on technologies such as Nimbus and the Virtual Data Toolkit from the Open Science Grid. These will allow users to build applications that leverage the co-location of custom genomic data sets and computational resources to perform analyses that were previously too difficult for groups with limited informatics support.
We have recruited a constellation of over 26 genomic analysis service providers, tool developers, and scientists with specific research interests in evolutionary analysis, microbial analysis, metagenomics, metatranscriptomics, plant genomics, proteomics, and transcriptomics. These collaborators will be provided with an unprecedented level of computational power for their genomics-related activities and will receive a guaranteed number of compute cycles, a suite of tools to access the DIAG, training, and engineering support. We will also partner with other academic grid service providers to develop standards and interoperation strategies for robustly sharing resources across a decentralized computer network. We will identify the most effective mode of operation for our users and grid partners, with the goal of eventually applying for funding to replicate the optimized model across several geographically distributed centers. Our ultimate aim is to create a sustainable, long-term analysis paradigm that effectively meets the needs of the larger genomics community.
The DIAG will have significant national impact. The researchers and educators who will use the system are distributed geographically across the US. The software produced at our site will be easy to adopt for NSF-funded projects such as the Globus Workspaces Project, the Open Science Grid, and other projects involving large multi-institutional collaborations. In addition, the DIAG will provide spillover capacity for two other science grids (MG-RAST and CAMERA) located in Illinois and California.
To train the next generation of biologists, we have partnered with educators who will use the DIAG to create a unique training environment. This will bring a powerful new analysis system to a total of 340 undergraduate, graduate, and postgraduate students annually in 15 different classroom settings. We have also allocated time on the DIAG to over 90 universities that predominantly serve underrepresented minorities and to 75 NSF-funded women researchers in science, greatly enhancing the computational capacity and computing infrastructure available to these groups.
AWARD OVERVIEW
| Award Number | 0959894 |
| Funding Agency | National Science Foundation |
| Total Award Amount | $1,894,381 |
| Award Date | 01/20/2010 |
| Project Status | More than 50% Completed |
| Jobs Reported | 1.33 |
| Project Location - City | Baltimore |
| Project Location - State | MD |
| Project Location - Zip | 21201-1508 |
| Project Location - Country | US |
| Congressional District | 07 |
Recipient Information (Grants)
| Recipient Name | UNIVERSITY OF MARYLAND |
| Recipient DUNS Number | 188435911 |
| Recipient Address | 220 ARCH ST RM 02148 |
| Recipient City | BALTIMORE |
| Recipient State | Maryland |
| Recipient Zip | 21201-1531 |
| Recipient Congressional District | 07 |
| Recipient Country | USA |
| Required to Report Top 5 Highly Compensated Officials | No |
Projects and Jobs Information
| Project Title | MRI-R2: Acquisition of Data Intensive Academic Grid (DIAG)-A computational platform for bioinformatics analyses and training |
| Project Status | More than 50% Completed |
| Final Project Report Submitted | No |
| Project Activities Description | Medical Research, General/Other |
| Quarterly Activities/Project Description | As defined in the award description field. |
| Jobs Created | 1.33 |
| Description of Jobs Created | Systems Programmer; Web Developer; Manager, Software Eng, BioInfo |
Purchaser Information (Grants)
| Contracting Office ID | Not Reported |
| Contracting Office Name | Not Available |
| Contracting Office Region | Not Available |
| TAS Major Program | 49-0101 |
Award Information
| Award Date | 01/20/2010 |
| Award Number | 0959894 |
| Order Number | |
| Award Type | Grants |
| Funding Agency ID | 49 |
| Funding Agency Name | National Science Foundation |
| Funding Office Name | Not Available |
| Awarding Agency ID | 49 |
| Awarding Agency Name | National Science Foundation |
| Amount of Award | $1,894,381 |
| Funds Invoiced/Received | $1,760,825 |
| Expenditure Amount | $1,789,325 |
| Infrastructure Expenditure Amount | $0 |
| Infrastructure Purpose and Rationale | Not Reported |
| Infrastructure Point of Contact Name | Not Reported |
| Infrastructure Point of Contact Email | Not Reported |
| Infrastructure Point of Contact Phone | Not Reported |
| Infrastructure Point of Contact Address | Not Reported |
| Infrastructure Point of Contact City | Not Reported |
| Infrastructure Point of Contact State | Not Reported |
| Infrastructure Point of Contact Zip | Not Reported |
Product or Service Information (Grants)
| Primary Activity Code | H01 |
| Activity Description | Medical Research, General/Other |
Sub-Awards Information
| Sub-awards to Organizations | 0 |
| Sub-award Amounts to Organizations | $0 |
| Sub-Awards to Individuals | 0 |
| Sub-Award Amounts to Individuals | $0 |
| Number of Sub-awards less than $25,000/award | 0 |
| Amount of Sub-awards less than $25,000/award | $0 |
| Number of payments to vendors greater than $25,000 | 2 |
| Total Amount of payments to vendors greater than $25,000/award | $1,156,916 |
| Number of payments to vendors less than $25,000/award | 10 |
| Total Amount of payments to vendors less than $25,000/award | $47,088 |
EMC Corporation - Award Number 0959894
| Award Number | 0959894 |
| Sub-Award Number | N/A |
| Vendor DUNS Number | 097447148 |
| Vendor HQ Zip Code + 4 | 01748-2209 |
| Vendor Name | EMC Corporation |
| Product and Service Description | Purchase EMC VNX 5100 computer storage system with extended support |
| Payment Amount | $34,310 |
IBM Corporation - Award Number 0959894
| Award Number | 0959894 |
| Sub-Award Number | N/A |
| Vendor DUNS Number | 967190179 |
| Vendor HQ Zip Code + 4 | 27709-0154 |
| Vendor Name | IBM Corporation |
| Product and Service Description | Purchase of IBM System X computer system and GPFS software for bioinformatic analysis. |
| Payment Amount | $1,122,606 |
Location Information
| Latitude, Longitude | 39° 17' 28", -76° 37' 32" |
| Congressional District | 07 |
| Address 1 | |
| Address 2 | |
| City | Baltimore |
| County | Baltimore City |
| State | MD |
| Zip | 21201-1508 |