Using Corpora in Translation Studies Research

Uploaded by: wj | Document ID: 476570 | Uploaded: 2023-04-29 | Format: PPTX | Pages: 60 | Size: 214.19 KB
Handouts

Tell me something I don't know
- Not all facts about language are accessible through native-speaker intuition
- It's often impossible to describe meanings of words without reference to the contexts those words occur in
- Sometimes you need a lot of data to see patterns
- Sometimes you need to count

What a corpus can help us see
- "cause" has negative semantic prosody
- "provide" has positive semantic prosody (see Louw 1993; Stubbs 1995)
- in general, passive is used around 10% of the time in English (see Halliday 1991)

Corpus
- a collection of naturally-occurring texts that are the object of literary or linguistic study
- held in electronic form, thus amenable to (semi-)automatic processing
- usually assembled in some principled way
- often highly structured, containing very carefully selected texts
- compiled for a purpose, eg linguistic analysis, lexicography, Natural Language Processing, translation research
- or a by-product of another activity, eg parliamentary debates (and their translations in bilingual or multilingual parliaments)

Well-known monolingual corpora
- British National Corpus (100m words) http://www.natcorp.ox.ac.uk/
- The Bank of English (524m words) http://www.collins.co.uk/books.aspx?group=153
- free access to a sample of 56m words at http://www.collins.co.uk/Corpus/CorpusSearch.asp
- The entire World Wide Web http://www.webcorp.org.uk/

Corpora and corpus processing software
Provide:
- a (relatively) objective basis for commentary on linguistic phenomena, or linguistic realizations of, eg, social phenomena
- a resource for quantitative and qualitative research
- the ability to access and manipulate (sort, display, annotate) vast quantities of data, thus facilitating analysis by humans

Access to corpora
- web access: Bank of English, BNC, IDS
- free access to sampler
- access to larger corpus by subscription/permission
- purchase own copy (eg BNC)
- create your own corpus

Web access to corpora: corpus processing software
Web-accessible corpora have dedicated interfaces that allow users, eg, to:
- choose texts/sub-corpora
- search the corpus for instances of a particular word (results in KWIC concordance format)
- display and sometimes sort results

Sample search: Cobuild Bank of English

Sample results: Cobuild Bank of English

Web access to corpora: corpus processing software
More sophisticated interfaces, eg, Cosmas I* (IDS in Mannheim), allowed searches for:
- all forms of a lemma: gehen gets gehe, ging, gehst, geht, etc.
- all compounds formed from a search word: Möbel gets Gartenmöbel, Möbellager, etc.
- all forms derived from a search word: Kind gets Kindchen, kindlich, etc.
* replaced in 2003 by Cosmas II

Building your own corpus - basic design issues
- Written vs spoken vs written and spoken
- Static vs dynamic (monitor)
- Synchronic (time of production?) vs diachronic
- General reference vs specialised
- Monolingual vs bilingual vs multilingual
- Domains to be covered
- Text types to be covered
- Level of annotation: raw vs annotation with extra-textual info (headers to indicate text title, author, speakers, etc) vs detailed linguistic annotation

Raw corpus / clean text
Example:
As he weakened, Moran became afraid of his daughters. This once powerful man was so implanted in their lives that they had never really left Great Meadow, in spite of jobs and marriages and children and houses of their own in Dublin and London. Now they could not let him slip away.
from John McGahern's Amongst Women

Corpus annotation: structural and POS tagging
Extract from the British National Corpus:
As he weakened, Moran became afraid of his daughters. This once powerful man was so implanted in their lives that they had never really left Great Meadow, in spite of jobs and marriages and children and houses of their own in Dublin and London. Now they could not let him slip away.
from John McGahern's Amongst Women

Building your own corpus - Getting the Texts
- Full texts vs text samples
- Sampling: random sampling or handpicking?
- Copyright
- Permission
- Representativeness: difficult concept to apply to textual data; onus is on researcher to document corpus contents very carefully

Conversion to electronic form?
- Text not available in electronic form: scanning + OCR vs keyboarding
- Text available in electronic form
  - Web pages (see Kilgarriff et al 2006)
  - Downloads from websites
  - Full-text databases
  - Donations from authors/translators
- Format texts should be saved in? .txt? .html? .xml? (know what your software can handle!)
- Is alignment necessary?
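
Once a text is available as plain text, the KWIC (keyword in context) concordance format mentioned for web-accessible corpora can be approximated in a few lines. The following is only a minimal sketch, not the interface of the Bank of English, the BNC or Cosmas: the function name `kwic`, the `width` parameter and the bracket notation around the hit are assumptions made for illustration, and the sample text is the McGahern extract quoted in the slides.

```python
import re

def kwic(text, keyword, width=30):
    """Return one KWIC line per whole-word, case-insensitive match of keyword."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Up to `width` characters of context on either side of the hit.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keyword forms a central column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

SAMPLE = (
    "As he weakened, Moran became afraid of his daughters. This once "
    "powerful man was so implanted in their lives that they had never "
    "really left Great Meadow, in spite of jobs and marriages and children "
    "and houses of their own in Dublin and London. Now they could not let "
    "him slip away."
)

for line in kwic(SAMPLE, "and"):
    print(line)
```

Sorting the returned lines by their right-hand context would give a rough equivalent of the "sort results" option that some interfaces offer.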

Doing Research Using Corpora
- Corpora can help us answer questions, but they also generate questions
- In general, passive is used around 10% of the time in English; but nearly 20% of the uses of "cause" as a verb in my scientific corpus are in the passive voice
- Does this tell me something about
  - the verb "cause"?
  - scientific English?
  - my sample of scientific English?
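
One way to start on a question like this is to check whether an observed passive rate could plausibly arise by chance under the 10% baseline. The sketch below uses an exact one-sided binomial test. The counts (40 passives out of 200 uses of "cause") are hypothetical and invented purely for illustration; they are not taken from any corpus. The test also assumes that each occurrence is independent, which real corpus data often violates.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical counts, invented for illustration only.
n_cause_verbs = 200   # verbal uses of "cause" found in the corpus
n_passive = 40        # of which in the passive voice (20%)
baseline = 0.10       # general passive rate cited in the slides

observed = n_passive / n_cause_verbs
p_value = binomial_upper_tail(n_passive, n_cause_verbs, baseline)

print(f"observed passive rate: {observed:.0%}")
print(f"P(at least {n_passive} passives at a {baseline:.0%} baseline) = {p_value:.2g}")
```

A small probability only suggests that the difference from the baseline is unlikely to be chance. It cannot by itself say whether the cause lies in the verb, in scientific English, or in this particular sample; that requires comparison with other verbs and other corpora.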

Hypothesis generation and testing
- Hypothesis: a tentative claim, eg "cause" tends to be used more in the passive than other English verbs
- hypotheses sometimes emerge from corpus data: corpus-driven research
- hypotheses sometimes formed before we look at corpus data: corpus-based research
- newly generated hypotheses can be tested against other (often bigger) corpora
- cycles of hypothesis formation, testing, refinement, testing common in corpus research

Some early descriptive hypotheses in translation studies
Translations tend to be simpler / more explicit / more conventional than:
- other texts in the same language
- their source texts
But how do you operationalize notions like simplification, explicitation, normalization in corpus-based research?
Eg, what concrete features of a text show it to be simpler than another text?
(See especially Baker 1993, 1995, 1996)

Early phase: Developing methodologies and resources
- Eg Question: Are translations really simplified?
- Eg Answer: Compared to what?
  - source texts? = parallel corpus methodology
  - other texts in the target language? = comparable corpus methodology

Parallel Corpus
- set of source texts in language A alongside their translations into language B (and perhaps languages C, D, E)
- can be bilingual or multilingual
  - eg English-Norwegian Parallel Corpus (ENPC) http://www.hf.uio.no/ilos/forskning/forskningsprosjekter/enpc/
  - eg European Parliament Proceedings Parallel Corpus http://www.statmt.org/europarl/
- can be unidirectional or bidirectional
  - eg German-English Parallel Corpus of Literary Texts, de-en
  - eg ENPC, en-no and no-en

Comparable Corpus
Monolingual Comparable Corpus
- set of texts translated into a language A alongside texts originally written in that same language
- Translational English Corpus (TEC) http:/
- comparable subset of the BNC
- Finnish Comparable Corpus (Mauranen 2004)

Early Corpus-based TS: Quantitative Bias?
- Attempt to approach translation objectively
- Reliance on properties of text that can be measured
  - Average word length
  - Average sentence length
  - Lexical density
  - Type-token ratio, etc
- Focus on monolingual comparable corpora
- Focus on general tendencies (or universals) in translation (Laviosa 2002; Mauranen and Kujamäki 2004)

Lexical Density
- the ratio of content words to the total number of words in a text
  - I made the chemicals hotter so they changed. (4/8 = 50%)
  - Raising the temperature produced a chemical change. (5/7 = 71%)
  (from Gibbons 2003: 20)
- the lower the lexical density, the simpler the text.

Type-Token Ratio
- the ratio of types to tokens in a corpus, eg the TTR for "The cat sat on the mat near the log fire" is 8/10 (or 80%)
- assumed to measure the (lexical) variety in a text: the higher the TTR, the more varied the text's vocabulary

TTR: An Example
In this piece I rail against the tendency of linguists to write about the philosophy of science as applied to their subject field instead of writing about what languages are like, which is what linguists are supposed to be good at. Unsympathetic critics will no doubt charge that by doing so I instantiate the very kind of behavior that I am railing against.
From Pullum 1991: 123
TTR = 49/63 = 78%

TTR continued
In this piece I rail against the tendency of linguists to write about the philosophy of science as applied to their subject field instead of writing about what languages are like, which is what linguists are supposed to be good at. Unsympathetic critics will no doubt charge that by doing so I instantiate the very kind of behavior that I am railing against. This is not so. I am complaining about unproductive metalevel discussion, which consists of linguists talking about doing linguistics instead of doing it. By offering a critique of such work, I am operating at a meta-metalevel, talking about linguists talking about doing linguistics instead of doing it. There is a difference.
TTR = 66/115 = 57%

Type/Token Ratio
- is extremely sensitive to text length: the longer the text is, the lower the TTR is (normally)
- solution: calculate the TTR for successive chunks of texts (eg every 100 or 1,000 words), and then take an average count at the end (standardized TTR)
- but even standardized TTRs are problematic as a marker of simplicity/complexity as they can't capture the difference between hard words and easy words, eg "fire" vs "conflagration"

Operationalizing Simplification
If translated texts are somehow simpler than original texts in the same language, then they might have:
- shorter average word length
- shorter average sentence length
- lower lexical density
- lower standardized type-token ratios
compared to originals

Initial Results: Investigating Simplification (Laviosa 1998a)
- Corpus: newspaper articles translated into English, and newspaper articles originally written in English
- Tool: WordList in WordSmith Tools
- Results
  - lexical density: lower in translated articles
  - average sentence length: lower in translated articles
  - standardized type/token ratio: no significant difference

Initial Results: Investigating Simplification (Laviosa 1998b)
- Corpus: fiction translated in
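
The measures the slides propose for operationalizing simplification (average word length, average sentence length, lexical density and standardized type-token ratio) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the procedure used by WordSmith Tools or in the Laviosa studies. The tokenizer, the sentence splitter and above all the short `FUNCTION_WORDS` list are simplifications introduced here; a real study would normally identify content words with a POS tagger or a fuller word list. Empty input is not handled.

```python
import re

# Assumption: a tiny hand-picked function-word list standing in for a real
# content-word classifier (normally a POS tagger or a fuller stoplist).
FUNCTION_WORDS = {
    "a", "an", "the", "i", "you", "he", "she", "it", "we", "they",
    "so", "and", "or", "but", "of", "in", "on", "at", "to", "near",
    "is", "are", "was", "were", "be", "been", "that", "this",
}

def tokenize(text):
    """Lowercased word tokens; internal apostrophes and hyphens are kept."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())

def sentences(text):
    """Naive sentence split on ., ! and ? (abbreviations are not handled)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]

def average_word_length(text):
    tokens = tokenize(text)
    return sum(len(t) for t in tokens) / len(tokens)

def average_sentence_length(text):
    sents = sentences(text)
    return sum(len(tokenize(s)) for s in sents) / len(sents)

def lexical_density(text):
    """Content words divided by all words."""
    tokens = tokenize(text)
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)

def type_token_ratio(tokens):
    """Distinct word forms divided by running words."""
    return len(set(tokens)) / len(tokens)

def standardized_ttr(tokens, chunk_size=1000):
    """Mean TTR over successive full chunks; a trailing partial chunk is ignored."""
    chunks = [tokens[i:i + chunk_size]
              for i in range(0, len(tokens) - chunk_size + 1, chunk_size)]
    if not chunks:
        raise ValueError("text is shorter than one chunk")
    return sum(type_token_ratio(c) for c in chunks) / len(chunks)

# The worked examples from the slides.
print(lexical_density("I made the chemicals hotter so they changed."))  # 0.5
cat = tokenize("The cat sat on the mat near the log fire")
print(type_token_ratio(cat))  # 0.8
```

With the two Gibbons sentences and the cat sentence, the functions reproduce the 4/8, 5/7 and 8/10 figures given in the slides. The chunk size for `standardized_ttr` is a parameter because the slides mention both 100 and 1,000 words.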
