Gentoo Sed by example Part 3

Disclaimer:

The original version of this article was first published on IBM developerWorks, and is property of Westtech Information Services. This document is an updated version of the original article, and contains various improvements made by the Gentoo Linux Documentation team.

This document is not actively maintained.

Sed by example, Part 3

1. Taking it to the next level: Data crunching, sed style

Muscular sed

In my second sed article, I offered examples that demonstrated how sed works, but very few of these examples actually did anything particularly useful. In this final sed article, it's time to change that pattern and put sed to good use. I'll show you several excellent examples that not only demonstrate the power of sed, but also do some really neat (and handy) things. For example, in the second half of the article, I'll show you how I designed a sed script that converts a .QIF file from Intuit's Quicken financial program into a nicely formatted text file. Before doing that, we'll take a look at some less complicated yet useful sed scripts.

Text translation

Our first practical script converts UNIX-style text to DOS/Windows format. As you probably know, DOS/Windows-based text files have a CR (carriage return) and LF (line feed) at the end of each line, while UNIX text has only a line feed. There may be times when you need to move some UNIX text to a Windows system, and this script will perform the necessary format conversion for you.

Code Listing 1.1: Format conversion between UNIX and Windows

$ sed -e 's/$/\r/' myunix.txt > mydos.txt

In this script, the '$' regular expression will match the end of the line, and the '\r' tells sed to insert a carriage return right before it. Insert a carriage return before a line feed, and presto, a CR/LF ends each line. Please note that the '\r' will be replaced with a CR only when using GNU sed 3.02.80 or later. If you haven't installed GNU sed 3.02.80 yet, see my first sed article for instructions on how to do this.
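One way to convince yourself that the conversion worked is to look at the raw bytes of the result. The following is only a sketch: it assumes GNU sed (for the '\r' escape) and the standard od and grep utilities, and the scratch directory and sample text are made up for illustration.

```shell
# Work in a scratch directory; the sample text is illustrative only.
workdir=$(mktemp -d)
printf 'first line\nsecond line\n' > "$workdir/myunix.txt"

# Append a carriage return to every line (GNU sed understands '\r').
sed -e 's/$/\r/' "$workdir/myunix.txt" > "$workdir/mydos.txt"

# Dump the bytes: each line should now end in \r \n instead of just \n.
od -c "$workdir/mydos.txt"

# Count the lines that end with a carriage return.
cr=$(printf '\r')
crlf_lines=$(grep -c "${cr}\$" "$workdir/mydos.txt")
echo "lines ending in CR: $crlf_lines"
```

If the count matches the number of lines in the input, every line was converted.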

I can't tell you how many times I've downloaded some example script or C code, only to find that it's in DOS/Windows format. While many programs don't mind DOS/Windows format CR/LF text files, several programs definitely do -- the most notable being bash, which chokes as soon as it encounters a carriage return. The following sed invocation will convert DOS/Windows format text to trusty UNIX format:

Code Listing 1.2: Converting C code from Windows to UNIX format

$ sed -e 's/.$//' mydos.txt > myunix.txt

The way this script works is simple: our substitution regular expression matches the last character on the line, which happens to be a carriage return. We replace it with nothing, causing it to be deleted from the output entirely. If you use this script and notice that the last character of every line of the output has been deleted, you've specified a text file that's already in UNIX format. No need for that!
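If you are not sure which format a file is in, a more cautious variant is to delete the last character only when it really is a carriage return. This is a sketch rather than part of the original script: it assumes GNU sed, which accepts '\r' inside a regular expression, and the mixed-format sample file is hypothetical.

```shell
# Hypothetical sample: one CR/LF line followed by one plain LF line.
workdir=$(mktemp -d)
printf 'dos line\r\nunix line\n' > "$workdir/mixed.txt"

# 's/.$//' deletes the last character of every line, whatever it is.
blind=$(sed -e 's/.$//' "$workdir/mixed.txt")

# 's/\r$//' (GNU sed) deletes the last character only if it is a CR.
careful=$(sed -e 's/\r$//' "$workdir/mixed.txt")

echo "with 's/.\$//':"
printf '%s\n' "$blind"
echo "with 's/\\r\$//':"
printf '%s\n' "$careful"
```

The first form truncates "unix line" to "unix lin", while the second leaves it alone, so the second form is safe to run on files that are already in UNIX format.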

Reversing lines

Here's another handy little script. This one will reverse lines in a file, similar to the "tac" command that's included with most Linux distributions. The name "tac" may be a bit misleading, because "tac" doesn't reverse the position of characters on the line (left and right), but rather the position of lines in the file (up and down). Tacing the following file:

Code Listing 1.3: Sample file

foo
bar
oni

...produces the following output:

Code Listing 1.4: Output file

oni
bar
foo

We can do the same thing with the following sed script:

Code Listing 1.5: Doing same with script

$ sed -e '1!G;h;$!d' forward.txt > backward.txt

You'll find this sed script useful if you're logged into a FreeBSD system, which doesn't happen to have a "tac" command. While handy, it's also a good idea to know why this script does what it does. Let's dissect it.

Reversal explained

First, this script contains three separate sed commands, separated by semicolons: '1!G', 'h' and '$!d'. Now, it's time to get a good understanding of the addresses used for the first and third commands. If the first command were '1G', the 'G' command would be applied only to the first line. However, there is an additional '!' character -- this '!' character negates the address, meaning that the 'G' command will apply to all but the first line. For the '$!d' command, we have a similar situation. If the command were '$d', it would apply the 'd' command to only the last line in the file (the '$' address is a simple way of specifying the last line). However, with the '!', '$!d' will apply the 'd' command to all but the last line. Now, all we need to do is understand what the commands themselves do.
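The effect of '!' is easy to see by pairing each address with a plain 'd' command. The sketch below reuses the three-line sample (foo, bar, oni); the variable names are made up for illustration.

```shell
# The three-line sample file from the text, held in a variable.
sample=$(printf 'foo\nbar\noni')

# '1d' deletes line one; '1!d' deletes every line except line one.
without_first=$(printf '%s\n' "$sample" | sed -e '1d')
only_first=$(printf '%s\n' "$sample" | sed -e '1!d')

# '$d' deletes the last line; '$!d' deletes every line except the last.
without_last=$(printf '%s\n' "$sample" | sed -e '$d')
only_last=$(printf '%s\n' "$sample" | sed -e '$!d')

# Unquoted expansions join the surviving lines with spaces for display.
echo '1d  leaves:' $without_first
echo '1!d leaves:' $only_first
echo '$d  leaves:' $without_last
echo '$!d leaves:' $only_last
```

Note the single quotes around every sed script: they keep the shell from interpreting '$' and '!' itself.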

When we execute our line reversal script on the text file above, the first command that gets executed is 'h' (the '1!G' command is skipped for line one). This command tells sed to copy the contents of the pattern space (the buffer that holds the current line being worked on) to the hold space (a temporary buffer). Then, the 'd' command is executed, which deletes "foo" from the pattern space, so it doesn't get printed after all the commands are executed for this line.

Now, line two. After "bar" is read into the pattern space, the 'G' command is executed, which appends the contents of the hold space ("foo\n") to the pattern space ("bar\n"), resulting in "bar\nfoo\n" in our pattern space. The 'h' command puts this back in the hold space for safekeeping, and 'd' deletes the line from the pattern space so that it isn't printed.

For the last "oni" line, the same steps are repeated, except that the contents of the pattern space aren't deleted (due to the '$!' before the 'd'), and the contents of the pattern space (three lines) are printed to stdout.
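To watch the pattern space grow line by line, you can swap the final '$!d' for the 'l' command, which prints the pattern space in an unambiguous form (embedded newlines appear as \n, and a $ marks the end). This is a sketch for exploration, not part of the original script; it feeds the sample through a pipe instead of a file.

```shell
# -n suppresses normal output, so only what 'l' prints is shown.
# After '1!G;h' the pattern space holds every line read so far, newest first.
trace=$(printf 'foo\nbar\noni\n' | sed -n -e '1!G;h;l')
printf '%s\n' "$trace"

# The real script: same accumulation, but only the last result is printed.
reversed=$(printf 'foo\nbar\noni\n' | sed -e '1!G;h;$!d')
printf '%s\n' "$reversed"
```

The trace shows no trailing newline in the pattern space, because sed strips it when reading a line and adds it back when printing.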

Now, it's time to do some powerful data conversion with sed.

sed QIF magic

For the last few weeks, I've been thinking about purchasing a copy of Quicken to balance my bank accounts. Quicken is a very nice financial program, and would certainly perform the job with flying colors. But, after thinking about it, I decided that I could easily write some software that would balance my checkbook. After all, I reasoned, I'm a software developer!

I developed a nice little checkbook balancing program (using awk) that calculates my balance by parsing a text file containing all my transactions. After a bit of tweaking, I improved it so that I could keep track of different credit and debit categories, just like Quicken can. But, there was one more feature I wanted to add. I recently switched my accounts to a bank that has an online Web account interface. One day, I noticed that my bank's Web site allowed me to download my account information in Quicken's .QIF format. In very little time, I decided that it would be really neat if I could convert this information into text format.

A tale of two formats

Before we look at the QIF format, here's what my checkbook.txt format looks like:

Code Listing 1.6: Sample of checkbook.txt format

28 Aug 2000     food    -       -       Y       Supermarket     30.94
25 Aug 2000     watr    -       103     Y       Check 103       52.86

In my file, all fields are separated by one or more tabs, with one transaction per line. After the date, the next field lists the type of expense (or "-" if this is an income item). The third field lists the type of income (or "-" if this is an expense item). Then, there's a check number field (again, "-" if empty), a transaction cleared field ("Y" or "N"), a comment and a dollar amount. Now, we're ready to take a look at the QIF format. When I viewed my downloaded QIF file in a text viewer, this is what I saw:

Code Listing 1.7: Malformed file output

!Type:Bank
D08/28/2000
T-8.15
N
PCHECKCARD SUPERMARKET
^
D08/28/2000
T-8.25
N
PCHECKCARD PUNJAB RESTAURANT
^
D08/28/2000
T-17.17
N
PCHECKCARD SUPERMARKET

After scanning the file, it wasn't very hard to figure out the format -- ignoring the first line, the format is as follows:

Code Listing 1.8: File format

D<date>
T<transaction amount>
N<check number>
P<description>
^ (this is the field separator)

Starting the process

When you're tackling a significant sed project like this, don't get discouraged -- sed allows you to gradually massage the data into its final form. As you progress, you can continue to refine your sed script until your output appears exactly as intended. You don't need to get it exactly right on the first try.

To start off, I created a file called qiftrans.sed, and started massaging the data:

Code Listing 1.9: qiftrans.sed

1d
/^^/d
s/[[:cntrl:]]//g
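To try these three commands out, you can run them against a small hand-made sample. The sketch below is illustrative only: the sample records are invented, and the CR/LF line endings are an assumption chosen to give the control-character cleanup something to remove.

```shell
workdir=$(mktemp -d)

# The three commands from the listing, saved as a script file.
cat > "$workdir/qiftrans.sed" <<'EOF'
1d
/^^/d
s/[[:cntrl:]]//g
EOF

# An invented sample with CR/LF line endings (an assumption for this demo).
printf '!Type:Bank\r\nD08/28/2000\r\nT-8.15\r\nN\r\nPCHECKCARD SUPERMARKET\r\n^\r\nD08/28/2000\r\nT-8.25\r\n' \
  > "$workdir/sample.qif"

# Run the script: the header and the '^' line vanish, and so do the CRs.
cleaned=$(sed -f "$workdir/qiftrans.sed" "$workdir/sample.qif")
printf '%s\n' "$cleaned"
```

Keeping the commands in a file and running them with 'sed -f' makes it easy to add one refinement at a time and rerun.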

The first '1d' command deletes the first line, and the second command deletes the lines that begin with those pesky '^' separator characters. The last line removes any control characters that may exist in the file. Since I'm dealing with a foreign file format, I want to eliminate the risk of encountering any control characters along the way. So far, so good. Now, it's time to add some processing punch to this basic script:

Code Listing 1.10: Improved basic script

1d
/^^/d
s/[[:cntrl:]]//g
/^D/{
    s/^D\(.*\)/\1\tOUTY\tINNY\t/
    s/^01/Jan/
    s/^02/Feb/
    s/^03/Mar/
    s/^04/Apr/
    s/^05/May/
    s/^06/Jun/
    s/^07/Jul/
    s/^08/Aug/
    s/^09/Sep/
    s/^10/Oct/
    s/^11/Nov/
    s/^12/Dec/
    s:^\(.*\)/\(.*\)/\(.*\):\2 \1 \3:
}

First, I add a '/^D/' address so that sed will only begin processing when it encounters the first character of the QIF date field, 'D'. All of the commands in the curly braces will execute in order as soon as sed reads such a line into its pattern space.

The first line in the curly braces will transform a line that looks like:

Code Listing 1.11: First line before change

D08/28/2000

into one that looks like this:

Code Listing 1.12: First line after change

08/28/2000      OUTY    INNY

Of course, this format isn't perfect right now, but that's OK. We'll gradually refine the contents of the pattern space as we go. The next 12 lines have the net effect of transforming the month to a three-letter format, with the last line removing the two slashes from the date and putting the day first.
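Here is a sketch of the script so far, run against a single invented record. It assumes GNU sed (for '\t' in the replacement text) and assumes that the final substitution's replacement reads '\2 \1 \3', with spaces, to match the "28 Aug 2000" style of checkbook.txt.

```shell
workdir=$(mktemp -d)

# The script as developed so far; leading blanks before commands are allowed.
cat > "$workdir/qiftrans.sed" <<'EOF'
1d
/^^/d
s/[[:cntrl:]]//g
/^D/{
    s/^D\(.*\)/\1\tOUTY\tINNY\t/
    s/^01/Jan/
    s/^02/Feb/
    s/^03/Mar/
    s/^04/Apr/
    s/^05/May/
    s/^06/Jun/
    s/^07/Jul/
    s/^08/Aug/
    s/^09/Sep/
    s/^10/Oct/
    s/^11/Nov/
    s/^12/Dec/
    s:^\(.*\)/\(.*\)/\(.*\):\2 \1 \3:
}
EOF

# One invented record: header line, date line, amount line.
converted=$(printf '!Type:Bank\nD08/28/2000\nT-8.15\n' | sed -f "$workdir/qiftrans.sed")

# 'sed -n l' makes the tabs visible as \t and marks each line end with $.
printf '%s\n' "$converted" | sed -n 'l'
```

Only the date line is rewritten at this stage; the 'T' line passes through untouched until later commands handle it.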
