Using Corpora in Translation Studies Research

Handouts

Tell me something I don't know
- Not all facts about language are accessible through native-speaker intuition
- It's often impossible to describe meanings of words without reference to the contexts those words occur in
- Sometimes you need a lot of data to see patterns
- Sometimes you need to count

What a corpus can help us see
- cause has negative semantic prosody
- provide has positive semantic prosody (see Louw 1993; Stubbs 1995)
- in general, passive is used around 10% of the time in English (see Halliday 1991)

Corpus
- a collection of naturally-occurring texts that are the object of literary or linguistic study
- held in electronic form, thus amenable to (semi-)automatic processing
- usually assembled in some principled way
- often highly structured, containing very carefully selected texts
- compiled for a purpose, e.g. linguistic analysis, lexicography, Natural Language Processing, translation research
- or a by-product of another activity, e.g. parliamentary debates (and their translations in bilingual or multilingual parliaments)

Well-known monolingual corpora
- British National Corpus (100m words) http://www.natcorp.ox.ac.uk/
- The Bank of English (524m words) http://www.collins.co.uk/books.aspx?group=153
  - free access to a sample of 56m words at http://www.collins.co.uk/Corpus/CorpusSearch.asp
- The entire World Wide Web http://www.webcorp.org.uk/

Corpora and corpus processing software
Provide:
- a (relatively) objective basis for commentary on linguistic phenomena, or linguistic realizations of, e.g., social phenomena
- a resource for quantitative and qualitative research
- the ability to access and manipulate (sort, display, annotate) vast quantities of data, thus facilitating analysis by humans

Access to corpora
- web access: Bank of English, BNC, IDS
  - free access to sampler
  - access to larger corpus by subscription/permission
- purchase own copy (e.g. BNC)
- create your own corpus

Web access to corpora: corpus processing software
Web-accessible corpora have dedicated interfaces that allow users, e.g., to:
- choose texts/sub-corpora
- search the corpus for instances of a particular word (results in KWIC concordance format)
- display and sometimes sort results

Sample search: Cobuild Bank of English

Sample results: Cobuild Bank of English

Web access to corpora: corpus processing software
More sophisticated interfaces, e.g. Cosmas I* (IDS in Mannheim), allowed searches for:
- all forms of a lemma: gehen gets gehe, ging, gehst, geht, etc.
- all compounds formed from a search word: Möbel gets Gartenmöbel, Möbellager, etc.
- all forms derived from a search word: Kind gets Kindchen, kindlich, etc.
* replaced in 2003 by Cosmas II

Building your own corpus - basic design issues
- Written vs spoken vs written and spoken
- Static vs dynamic (monitor)
- Synchronic (time of production?) vs diachronic
- General reference vs specialised
- Monolingual vs bilingual vs multilingual
- Domains to be covered
- Text types to be covered
- Level of annotation: raw vs annotation with extra-textual info (headers to indicate text title, author, speakers, etc.) vs detailed linguistic annotation

Raw corpus / clean text
Example:
As he weakened, Moran became afraid of his daughters. This once powerful man was so implanted in their lives that they had never really left Great Meadow, in spite of jobs and marriages and children and houses of their own in Dublin and London. Now they could not let him slip away.
(from John McGahern's Amongst Women)

Corpus annotation: structural and POS tagging
Extract from the British National Corpus:
As he weakened, Moran became afraid of his daughters. This once powerful man was so implanted in their lives that they had never really left Great Meadow, in spite of jobs and marriages and children and houses of their own in Dublin and London. Now they could not let him slip away.
(from John McGahern's Amongst Women)

Building your own corpus - Getting the Texts
- Full texts vs text samples
- Sampling: random sampling or handpicking?
- Copyright
- Permission
- Representativeness: difficult concept to apply to textual data; onus is on researcher to document corpus contents very carefully

Conversion to electronic form?
- Text not available in electronic form: scanning + OCR vs keyboarding
- Text available in electronic form:
  - Web pages (see Kilgarriff et al 2006)
  - Downloads from websites
  - Full-text databases
  - Donations from authors/translators
- Format texts should be saved in? .txt? .html? .xml? (know what your software can handle!)
- Is alignment necessary?
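
Once texts are saved as plain text, a KWIC (key word in context) concordance of the kind the web interfaces return can be approximated in a few lines. The following is a minimal sketch, not the method used by any of the tools named in these slides: the function names `kwic` and `format_kwic` and the `width` parameter are illustrative assumptions, and the sample is the McGahern passage quoted earlier.

```python
import re

def kwic(text, keyword, width=30):
    """Return (left, match, right) tuples for each whole-word hit of keyword."""
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for m in pattern.finditer(flat):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        hits.append((left, m.group(), right))
    return hits

def format_kwic(hits, width=30):
    """Right-align the left context so the keyword sits in one column."""
    return [f"{left:>{width}}{word}{right}" for left, word, right in hits]

sample = (
    "As he weakened, Moran became afraid of his daughters. This once powerful "
    "man was so implanted in their lives that they had never really left Great "
    "Meadow, in spite of jobs and marriages and children and houses of their "
    "own in Dublin and London. Now they could not let him slip away."
)

hits = kwic(sample, "of", width=25)
# Sort by right-hand context, one common way of ordering concordance lines.
hits.sort(key=lambda h: h[2].lower())
for line in format_kwic(hits, width=25):
    print(line)
```

Sorting on the left context instead (for example on the reversed left string) would group the words that precede the search word, which is often how patterns such as semantic prosody become visible.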

Doing Research Using Corpora
- Corpora can help us answer questions, but they also generate questions
- In general, passive is used around 10% of the time in English; but nearly 20% of the uses of cause as a verb in my scientific corpus are in the passive voice
- Does this tell me something about
  - the verb cause?
  - scientific English?
  - my sample of scientific English?
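
One way to approach the last of these questions is to ask how likely a 20% passive rate would be in a sample of a given size if the true rate were the general 10%. The sketch below computes an exact one-sided binomial tail. The counts (200 uses of cause as a verb, 40 of them passive) are hypothetical and chosen only to match the 20% figure; the slides give no raw counts. The function name `binomial_upper_tail` is likewise an illustrative assumption.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical counts, NOT taken from the slides: 40 passives in 200 uses (20%).
n_uses, n_passive = 200, 40
baseline = 0.10  # the "around 10%" general figure for English

p_value = binomial_upper_tail(n_passive, n_uses, baseline)
print(f"{n_passive}/{n_uses} passive = {n_passive / n_uses:.0%}")
print(f"P(X >= {n_passive} | n = {n_uses}, p = {baseline}) = {p_value:.2e}")
```

A small tail probability would only suggest that the sample differs from the 10% baseline. It cannot say whether the difference comes from the verb, from scientific English, or from the particular sample, which is why testing against other corpora remains necessary.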

Hypothesis generation and testing
- Hypothesis: a tentative claim, e.g. cause tends to be used more in the passive than other English verbs
- hypotheses sometimes emerge from corpus data: corpus-driven research
- hypotheses sometimes formed before we look at corpus data: corpus-based research
- newly generated hypotheses can be tested against other (often bigger) corpora
- cycles of hypothesis formation, testing, refinement, testing common in corpus research

Some early descriptive hypotheses in translation studies
Translations tend to be simpler/more explicit/more conventional than:
- other texts in the same language
- their source texts
But how do you operationalize notions like simplification, explicitation, normalization in corpus-based research?
E.g., what concrete features of a text show it to be simpler than another text?
(See especially Baker 1993, 1995, 1996)

Early phase: Developing methodologies and resources
- E.g. Question: Are translations really simplified?
- E.g. Answer: Compared to what?
  - source texts? = parallel corpus methodology
  - other texts in the target language? = comparable corpus methodology

Parallel Corpus
- set of source texts in language A alongside their translations into language B (and perhaps languages C, D, E)
- can be bilingual or multilingual
  - e.g. English-Norwegian Parallel Corpus (ENPC) http://www.hf.uio.no/ilos/forskning/forskningsprosjekter/enpc/
  - e.g. European Parliament Proceedings Parallel Corpus http://www.statmt.org/europarl/
- can be unidirectional or bidirectional
  - e.g. German-English Parallel Corpus of Literary Texts: de-en
  - e.g. ENPC: en-no and no-en

Comparable Corpus
- Monolingual Comparable Corpus: set of texts translated into a language A alongside texts originally written in that same language
  - Translational English Corpus (TEC) http:/
  - comparable subset of the BNC
  - Finnish Comparable Corpus (Mauranen 2004)

Early Corpus-based TS: Quantitative Bias?
- Attempt to approach translation objectively
- Reliance on properties of text that can be measured
  - Average word length
  - Average sentence length
  - Lexical density
  - Type-token ratio, etc.
- Focus on monolingual comparable corpora
- Focus on general tendencies (or universals) in translation (Laviosa 2002; Mauranen and Kujamäki 2004)

Lexical Density
- the ratio of content words to the total number of words in a text
  - I made the chemicals hotter so they changed. (4/8 = 50%)
  - Raising the temperature produced a chemical change. (5/7 = 71%)
  (from Gibbons 2003: 20)
- the lower the lexical density, the simpler the text.

Type-Token Ratio
- the ratio of types to tokens in a corpus, e.g. the TTR for "The cat sat on the mat near the log fire" is 8/10 (or 80%)
- assumed to measure the (lexical) variety in a text: the higher the TTR, the more varied the text's vocabulary
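
Both ratios defined above can be checked on the slide examples with a short sketch. It rests on two assumptions that are not in the slides: content words are approximated as "anything not in a small hand-made function-word list" (a real study would more likely use a POS tagger or a fuller list), and tokens are lower-cased alphabetic strings. The names `FUNCTION_WORDS`, `tokenize`, `lexical_density` and `type_token_ratio` are illustrative.

```python
import re

# Tiny illustrative function-word list (an assumption, far from complete).
FUNCTION_WORDS = {
    "a", "an", "the", "i", "you", "he", "she", "it", "we", "they",
    "so", "and", "or", "but", "of", "in", "on", "at", "to", "near",
    "is", "are", "was", "were",
}

def tokenize(text):
    """Lower-case the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())

def lexical_density(text):
    """Content words divided by all words (0.0 for an empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)

def type_token_ratio(text):
    """Distinct word forms (types) divided by running words (tokens)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

for sentence in (
    "I made the chemicals hotter so they changed.",
    "Raising the temperature produced a chemical change.",
):
    print(f"{lexical_density(sentence):.0%}  {sentence}")

print(f"{type_token_ratio('The cat sat on the mat near the log fire'):.0%}")
```

Because "The" and "the" are folded together by lower-casing, the cat sentence has 8 types over 10 tokens, matching the figure on the slide. Different tokenization choices (case, hyphens, apostrophes) can shift the counts slightly, so they should be documented along with the results.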

TTR: An Example
In this piece I rail against the tendency of linguists to write about the philosophy of science as applied to their subject field instead of writing about what languages are like, which is what linguists are supposed to be good at. Unsympathetic critics will no doubt charge that by doing so I instantiate the very kind of behavior that I am railing against.
(From Pullum 1991: 123)
TTR = 49/63 = 78%

TTR continued
In this piece I rail against the tendency of linguists to write about the philosophy of science as applied to their subject field instead of writing about what languages are like, which is what linguists are supposed to be good at. Unsympathetic critics will no doubt charge that by doing so I instantiate the very kind of behavior that I am railing against. This is not so. I am complaining about unproductive metalevel discussion, which consists of linguists talking about doing linguistics instead of doing it. By offering a critique of such work, I am operating at a meta-metalevel, talking about linguists talking about doing linguistics instead of doing it. There is a difference.
TTR = 66/115 = 57%

Type/Token Ratio
- is extremely sensitive to text length: the longer the text is, the lower the TTR is (normally)
- solution: calculate the TTR for successive chunks of texts (e.g. every 100 or 1,000 words), and then take an average count at the end (standardized TTR)
- but even standardized TTRs are problematic as a marker of simplicity/complexity as they can't capture the difference between hard words and easy words, e.g. fire vs conflagration

Operationalizing Simplification
If translated texts are somehow simpler than original texts in the same language, then they might have:
- shorter average word length
- shorter average sentence length
- lower lexical density
- lower standardized type-token ratios
compared to originals
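
Three of the measures listed above can be sketched as follows. This is an illustration under stated assumptions, not a reimplementation of WordSmith Tools: the standardized TTR here averages the TTR of successive full chunks and discards a trailing partial chunk, sentences are split naively on ., ! and ? followed by whitespace, and all function names and the `chunk_size` parameter are hypothetical.

```python
import re

def tokenize(text):
    """Lower-case the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())

def standardized_ttr(tokens, chunk_size=1000):
    """Mean type-token ratio over successive full chunks of chunk_size tokens.

    A trailing chunk shorter than chunk_size is ignored (an assumption).
    """
    ratios = []
    for start in range(0, len(tokens) - chunk_size + 1, chunk_size):
        chunk = tokens[start:start + chunk_size]
        ratios.append(len(set(chunk)) / chunk_size)
    if not ratios:
        raise ValueError("text is shorter than one chunk")
    return sum(ratios) / len(ratios)

def average_word_length(tokens):
    """Mean number of characters per token."""
    return sum(len(t) for t in tokens) / len(tokens)

def average_sentence_length(text):
    """Mean number of tokens per sentence, using a naive sentence splitter."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)

# Repeating a text halves its plain TTR but leaves the chunked figure unchanged.
repeated = ["a", "b", "c", "d"] * 2
print(len(set(repeated)) / len(repeated))        # plain TTR over all 8 tokens
print(standardized_ttr(repeated, chunk_size=4))  # mean TTR of two 4-token chunks

print(average_word_length(["fire", "conflagration"]))
print(average_sentence_length("Now they could not let him slip away. This is not so."))
```

To compare translated and original texts, each function would be run over both subcorpora with the same tokenization and chunk size, since the figures are only comparable when computed the same way.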

Initial Results: Investigating Simplification (Laviosa 1998a)
- Corpus: newspaper articles translated into English, and newspaper articles originally written in English
- Tool: WordList in WordSmith Tools
- Results
  - lexical density: lower in translated articles
  - average sentence length: lower in translated articles
  - standardized type/token ratio: no significant difference

Initial Results: Investigating Simplification (Laviosa 1998b)
- Corpus: fiction translated in