Optimization based automated curation of metabolic reconstructions
Vinay Satish Kumar,1 Madhukar S Dasika,2 and Costas D Maranas2
1Department of Industrial and Manufacturing Engineering, The Pennsylvania State University, University Park, PA 16802, USA
2Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802, USA
Corresponding author.
Vinay Satish Kumar: vsk111@psu.edu; Madhukar S Dasika: msd179@psu.edu; Costas D Maranas: costas@psu.edu
Received December 14, 2006; Accepted June 20, 2007.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Background
Currently, there exist tens of different microbial and eukaryotic metabolic reconstructions (e.g., Escherichia coli, Saccharomyces cerevisiae, Bacillus subtilis) with many more under development. All of these reconstructions are inherently incomplete with some functionalities missing due to the lack of experimental and/or homology information. A key challenge in the automated generation of genome-scale reconstructions is the elucidation of these gaps and the subsequent generation of hypotheses to bridge them.
Results
In this work, an optimization based procedure is proposed to identify and eliminate network gaps in these reconstructions. First we identify the metabolites in the metabolic network reconstruction which cannot be produced under any uptake conditions, and subsequently we identify the reactions from a customized multi-organism database that restore the connectivity of these metabolites to the parent network. This connectivity restoration is hypothesized to take place through four mechanisms: a) reversing the directionality of one or more reactions in the existing model, b) adding reactions from another organism to provide functionality absent in the existing model, c) adding external transport mechanisms to allow for importation of metabolites in the existing model and d) restoring flow by adding intracellular transport reactions in multi-compartment models. We demonstrate this procedure for the genome-scale reconstruction of Escherichia coli and also Saccharomyces cerevisiae, wherein compartmentalization of intracellular reactions results in a more complex topology of the metabolic network. We determine that about 10% of metabolites in E. coli and 30% of metabolites in S. cerevisiae cannot carry any flux. Interestingly, the dominant flow restoration mechanism is directionality reversals of existing reactions in the respective models.
Conclusion
We have proposed systematic methods to identify and fill gaps in genome-scale metabolic reconstructions. The identified gaps can be filled both by making modifications in the existing model and by adding missing reactions by reconciling multi-organism databases of reactions with existing genome-scale models. Computational results provide a list of hypotheses to be queried further and tested experimentally.
Background
The genomes of several microorganisms have been completely sequenced and annotated in the past decade [1-4]. This information has aided the metabolic reconstructions of several microbial and eukaryotic organisms using experimental evidence and bioinformatics based techniques, providing single compartment (e.g., Escherichia coli [5]) and multi-compartment models (e.g., Saccharomyces cerevisiae [6]). All of these reconstructions are inherently incomplete with some functionalities missing due to the lack of experimental and/or homology information. These missing reaction steps may lead to the prediction of erroneous genetic interventions for a targeted overproduction or the elucidation of misleading organizational principles and properties of the metabolic network. A key challenge in the automated generation of genome-scale reconstructions is the elucidation of these gaps and the subsequent generation of hypotheses to bridge them. This challenge has already been recognized and a number of computational approaches have been under development to resolve these discrepancies [7-11].
Most of the aforementioned efforts are based on the use of sequence homology to uncover missing genes. Specifically, sequence homology is used to pinpoint genes in related species that have significant similarity with an unassigned ORF of the curated microorganism [12]. Green et al. formalized and further extended this concept by introducing a method that identified missing enzymes in a metabolic network using sequence homology related metrics within a Bayesian framework [11]. Alternatively, non-homology based reconstructions have been implemented by identifying candidate genes by measuring the similarity with metrics such as mRNA coexpression data [8] and phylogenetic profiles [10] while also taking into account the local structure of the existing partially reconstructed metabolic networks. A recent advancement in this direction uses multiple types of association evidence including clustering of genes on the chromosome and protein fusion events in addition to phylogenetic profiles [9]. All methods described above postulate a set of candidate genes and then evaluate the likelihood that any of these candidate genes is present in the microorganism's metabolic network of interest using a variety of scoring metrics. In addition to these approaches, various genomic context analyses have also been used to identify missing metabolic genes [7,13-16]. Specifically, a recent study exploits the availability of highly curated metabolic networks to hypothesize gene reaction interactions in less characterized organisms [16]. These aforementioned methods predict missing enzymes in the metabolic network by conducting sequence based comparisons of entire genomes and inferring possible metabolic functions across different microorganisms.
Alternatively, a recent systems based computational approach identifies the location of missing metabolic functions in the E. coli iJR904 model by pinpointing discrepancies between in silico model predictions and known in vivo growth phenotypes [17]. Subsequently, an optimization based algorithm is used to resolve these discrepancies by searching for missing metabolic functions from a candidate database of reactions. In this paper, instead, we pinpoint metabolites that cannot carry any flux and subsequently generate hypotheses to restore connectivity. To this end, we introduce an optimization based procedure (GapFind) to first identify such gaps in both single and multi-compartment metabolic networks and subsequently use an optimization based procedure (GapFill) to restore their connectivity using separate pathology resolving hypotheses. In contrast to the previous methods, which fill gaps only by identifying missing enzymes [8-11,17] or adding transport reactions [17], we also explore whether these gaps can be filled by making intra-model modifications. Figure 1 pictorially illustrates how such gaps arise in metabolic reconstructions and introduces the definitions proposed in this paper to precisely describe these pathologies.
Figure 1
Characterization of problem metabolites in metabolic networks. Metabolite A is defined as a root no-production metabolite because there is no production or transport mechanism for it in the network. Metabolite C is a downstream no-production metabolite ...
Gaps in metabolic reconstructions are manifested as (i) metabolites which cannot be produced by any of the reactions or imported through any of the available uptake pathways in the model; or (ii) metabolites that are not consumed by any of the reactions in the network or exported based on any existing secretion pathways. We refer to these metabolites as root no-production (e.g., metabolite A) and root no-consumption metabolites (e.g., metabolite B) respectively. At steady-state conditions no flow can pass through them due to the incomplete connectivity with the rest of the network. Clearly, such pathologies are not physiologically relevant and thus must be caused by omission and/or errors in the model reconstruction process. Notably, the lack of flow in root no-production metabolites and root no-consumption metabolites is propagated downstream/upstream respectively, giving rise to additional metabolites that cannot carry any flow. We refer to these metabolites that are indirectly prevented from carrying flow as downstream no-production metabolites (e.g., metabolite C) and upstream no-consumption metabolites (e.g., metabolite D) respectively. It is important to note that by restoring connectivity for the root problem metabolites all upstream/downstream metabolites are also automatically fixed. We concentrate on resolving only no-production metabolites in the case of cytosolic metabolites. In the case of non-cytosolic (i.e., present in internal compartments) metabolites, we identify mechanisms to resolve both no-production and no-consumption metabolites.
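The four categories defined above can be illustrated with a small, purely topological sketch. It is not the optimization based GapFind procedure of this paper: it assumes that every reaction is irreversible, ignores stoichiometric balancing, and only propagates the absence of producing or consuming reactions through the network. The toy network, the reaction names and the function classify_gaps are hypothetical and chosen only to mirror metabolites A to D.

```python
# Hypothetical toy network (not taken from the paper). Each reaction maps
# metabolite -> stoichiometric coefficient (negative = consumed,
# positive = produced). All reactions are assumed irreversible.
reactions = {
    "R1": {"A": -1, "C": 1},   # A -> C  (nothing produces A)
    "R2": {"C": -1, "E": 1},   # C -> E
    "uptake_X": {"X": 1},      # uptake of X from the medium
    "R3": {"X": -1, "D": 1},   # X -> D
    "R4": {"D": -1, "B": 1},   # D -> B  (nothing consumes B)
    "R5": {"X": -1, "E": 1},   # X -> E
    "export_E": {"E": -1},     # secretion of E
}


def classify_gaps(reactions):
    """Return (root_np, downstream_np, root_nc, upstream_nc) as sets."""
    metabolites = {m for stoich in reactions.values() for m in stoich}
    producers = {m: set() for m in metabolites}
    consumers = {m: set() for m in metabolites}
    for rxn, stoich in reactions.items():
        for met, coeff in stoich.items():
            if coeff > 0:
                producers[met].add(rxn)
            elif coeff < 0:
                consumers[met].add(rxn)

    def propagate(feeders, drains):
        # Root metabolites have no feeder reaction at all.
        root = {m for m in metabolites if not feeders[m]}
        affected = set(root)
        changed = True
        while changed:
            changed = False
            # Reactions touching an affected metabolite on the drain side
            # cannot carry flow in this simplified picture.
            blocked = {r for m in affected for r in drains[m]}
            for m in metabolites - affected:
                # A metabolite is affected once all its feeders are blocked.
                if feeders[m] <= blocked:
                    affected.add(m)
                    changed = True
        return root, affected - root

    root_np, downstream_np = propagate(producers, consumers)
    root_nc, upstream_nc = propagate(consumers, producers)
    return root_np, downstream_np, root_nc, upstream_nc


root_np, downstream_np, root_nc, upstream_nc = classify_gaps(reactions)
print("root no-production:", root_np)
print("downstream no-production:", downstream_np)
print("root no-consumption:", root_nc)
print("upstream no-consumption:", upstream_nc)
```

On this toy network the sketch labels A as root no-production, C as downstream no-production, B as root no-consumption and D as upstream no-consumption, while X and E remain connected through the alternative route R5. A flux based formulation would additionally account for reversibility and steady-state mass balances, which this reachability style check does not.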
For single compartment metabolic networks (where we have only cytosolic metabolites), we postulate three separate mechanisms for fixing no-production metabolites (see also Figure 2). We explore whether (i) reversing the directionality of existing reactions in the model (Mechanism 1), (ii) adding new reactions from a multi-species database (e.g., MetaCyc [18]) (Mechanism 2) or finally (iii) allowing for the direct importation of the problem metabolite restores flow into the no-production metabo