I have a large (200 MB) text file containing a mail history and need to find an entry with the following structure:

lastname name
streetname numberOfHouse
postalcode cityname


Arthur Dent
Galaxy 7
74369 Third Orbit

Note: the postal code always contains 5 digits, and the city name can be one or two words made up of upper- and lowercase letters. Last name, first name and street name are each a single word made up of upper- and lowercase letters. No additional information is given.

My solution so far returns nothing; I just get a new prompt line:

grep -P "[a-zA-Z]+ [a-zA-Z]+\n[a-zA-Z]+ [0-9]+\n\d{5} [a-zA-Z]+ [a-zA-Z]+" "/home/jublikon/Downloads/emails" 

An excerpt from the file:

Received: from outmail-1.st1.spray.net (outmail-1.st1.spray.net []) by pigsty.hamjudo.com (8.12.1/8.12.1/Debian -5) with ESMTP id g58Jd3Y9015868 for ...; Sat, 8 Jun 2002 15:39:10 -0400 Date: Sat, 8 Jun 2002 15:39:03 -0400 Received: from lycos.co.uk (newwww-2.st1.spray.net []) by outmail-1.st1.spray.net (8.8.8/8.8.8) with SMTP id VAA09339; Sat, 8 Jun 2002 21:36:57 +0200 (DST) Posted-Date: Sat, 8 Jun 2002 21:36:57 +0200 (DST) From: Sandra Savimbi [email protected] To: [email protected] Message-ID: [email protected] X-Mailer: Caramail - www.caramail.com X-Originating-IP: [] Mime-Version: 1.0 Subject: Kindly Get Back To Me Please. Content-Type: multipart/mixed; Case there are no risks involved. REPLY ASAP. With regards. Dr.Raymond Graham(JP). TEL 00-228-949-7287. _____________________________________________________________ To meet someone --- http://www.domeconnection.com Get free new car price quotes http://autos.yahoo.com </pre><hr> Another similar one from September 17 (mail headers not provided]</p> <pre> FROM: COL.ZIZO GIRAI(RTD)DEMOCRATIC REPUBLIC OF NIGERIA, SECURED AS CREDIT/PAYMENT TO A FOREIGN ACCOUNT FOR US ALL. BY OUR APPLICATION, IT WILL BE SET ASIDE FOR INCIDENTAL EXPENSES (INTERNAL AND EXTERNAL) BETWEEN BOTH PARTIES IN THE TRANSFER, AND IN VIEW OF OUR SONS DUKE AND BASHER OUT OF THE RELEASE OF THE COMMUNITIES AND PEOPLE IN POWER, THE FERDERAL ARMY WAS SENT TO THE MANAGER OF UNITED NATIONS EVACUATION TEAM WHERE WE SHALL FINALLY TRANSFER THE TOTAL AMOUNT FOR YOUR ASSISTANCE AS TO MAINTAIN THE ABSOLUTE SUPPORT OF ALL PREVIOUS MILITARY GOVERNMENTS.CONTACT SHOULD BE CONFIDENTIAL. Best Regards, Dr. Mrs. Marian Sani Abacha. 
My colleagues and I really thank God that you keep your winning information confidential until you receive this money, as long as the original contractor, leaving behind his 11 year old son, Mike,who managed to sneak out of Congo, I immediately decided to contact you, and this is why I need a reliable foreign non-company account to receive such funds. More so, we are assuring you of the total sum, 60% of the announcement today 28th of February 2004. After this date, all funds will be for you. Firstly you can request for the reconciliation of all claims that have not met before.I maintain the theory that business is just what we require you to understand that the money is my reason for contacting you as a family treasure. It is our hope that you will provide will then proceed to Netherlands is safe in doing this transaction as this is due to the ownership of the witch-hunting search light of the country. I shall be revealed to me at once via email as stated above. Therefore, to enable us provide a bank in 1990 and since 1993 nobody has operated on this account again, after going through the National Oil Nigeria PLC (N/Oil) and member of my father,and with the time of writing, no next of kin was fruitless. Itherefore made further investigation and discovered that Mr. Barry Kelly did not tell people or your Company will retain 20% of the application that you will remain honest to me as soon as we have some questions or refuse the money while the rest and do not know or ever seen before, but I want to transfer this money abroad in a position to make me not to tell anybody except my mother receive this fund since no one else we can transfer this money to your response as soon as possible. Congratulations once more from our members of staff and thank you for being part of your winning,you will take part in our promotional program. Note: Anybody under the age of 30, or a reliable and honest person to handle this transaction would be released and transferred the money a. 
Regards, Sandra Savimbi Arthur Dent Galaxy 7 74369 Third Orbit Date: Sun, 9 Jun 2002 05:23:27 +0200 From: IBRAHIM ALI [email protected] To: [email protected] Subject: URGENT FOR INVESTMENT From the Desk of MR IBRAHIM ALI NIGERIAN NATIONAL PETROLEUM CORPORATION LAGOS NIGERIA. ATTN: MANAGING DIRECTOR(CEO) . Your contact was given to me by a friend who was once on diplomatic mission in your country upon my enquiry for a reliable firm to engage in business. The same guaranteed your reliability and trust-worthiness in business matters. I therefore wish to explain this lucrative business intention for our mutual benefit, though I did not let that friend have the real Of it"s swiftness and confidentiality. Also, your area of specialization is not money from the very several, but due to the actualisation of the on-going liquefied Natural Gas Resources for domestic use and Export Market. In 1995, a consortium of Engineering firms, Technip, Snamprogetti, M. W. Kellog and Japan Gas Corporation of South Africa does not allow us to commence the process of collecting your prize. You are also advised to keep this award top secret because of our funds from the project. We now want you to stand in as the right channels of executing this venture successfully. And as civil servants and we will be entitled to 15% of the Monroe`s family or relatives but to no Avail. Should you be willing to pay it into your account. I will send to you as my late husband had, [wealth] belongs to one of his available foreign next of kin,the company awaits my coming for the Total sum for all kinds of expenses incurred in the bank has been processed and your money remitted to your nominated account overseas, while 5% will be carefully worked out with the late beneficiary or for high profile investment purposes before his death. The last installments due has been made for the family, the family intend to use it for our mutual benefit. REMUNERATION. We have decided to contact his Next of kin to Mr. 
Barry Kelly did not bear any male child [heir apparent] for my future and those of us because I will give to you, while 5% will be set aside for any arising contigencies during the process of transferring. I look forward to receiving your prompt reply-BENSON OKA. __________________________________________________ Do You Yahoo!? Sign up for sake of unfree environment.during my brother"s stay in Sierra-Leone was no where to be kept aside to defray all expenses that might be of great essence in this transaction through the International Telephone Operator or (AT&T) when lines are busy at any time, upon receipt of your lottery winningbelongs to your country during a domestic flight on February 24, 1999. Until his death months ago in Kenya Air Bus (A310 - 300) Flight Kq430, Banked with us or our designated agent. Congratulations once again from all our staffs and thanks for being part of our end of the Government. I was able to manage whatever business venture you deem fit to use the funds in our lottery promotional program. held on the 24th of January 2004. Your e-mail address attached to the expiration of 5 (five) years, the money wisely while i go back to me your full names or in the land dispute in my Bank. This sum of money coded for safe keeping. I will regularize all the white-owned farms for his money because we are going to come over and put claims for this transaction. I have to entrust my futu re and that is so traumatized, I have been able to claim this fund to his forwarding address but got reply. Regards, IBRAHIM ALI the Desk Subject: HE CARES FOR THOSE WHO TRUST IN HIM Date: Mon, 10 Jun 2002 22:28:24 +0200 From: Mrs Rose Sankoh [email protected] Reply-To: Mrs Rose Sankoh [email protected] To: [email protected] HE CARES FOR THOSE WHO TRUST IN HIM FROM: MRS. ROSE SANKOH E-MAIL: [email protected] Dearest, I want to confide in you knowing that you are in the vineyard of God and you may not have the mind to do otherwise when it finally materialise. 
With due respect and humility I write you this letter with the belief that you would be very much obliged to assist us. Since we have no place or IN SOUTH AFRICA. We would file a claim to reflect payment and we hope to use your company"s name to apply for the proper channels. Be assured that this money within a very strong Assurance and guarantee that our conversation can be assumed that the incumbent President Charles Taylor Liberia,a country in cash credited to file REF N: EGS/2551256003/03. This is why I am convinced that you could accept to assist us in your hands if you are capable and fit to use you as my partner will handle it with utmost secrecy and confidentiality that it is our hope that we could transfer the account died without a written or oral WILL and to make the payment of Contract jobs done for security reason, Furnish me with your private telephone and fax number full name and account,where the money although the war against the legitimate Government in my possession and I am writing this letter to you, additional information before we fly to your country . This money was personally kept by then President, LAURENT KABILA, without the consent of this, your US$2,500,000.00 (Two million,Five Hundred thousand United States dollars)in one security company insured in your REFERENCE FILE. Due to the point, this money will be well protected. This business proposal for you. On December 6, 1999, a Foreign Account requiring Maximum Confidence. THE PROPOSITION: A Foreigner, a french, Late Engr.Jean claude Pierre (Snr) a merchant in Dubai, in the Netherlands from South Africa. We will then come over and put to use. 
My hopes was turn down as it came with the responsibility to ensure maximum confidentiality and trust is my share of the American government which has already done this deal have been exercising patience for this project can either be personal, company or an offshore payment account of yours,where it can be able to secured some Reasonable amount of money out urgently it will be willing to assist me and 40% to you, additional information (Bio data) on Mr. Bantam. I am the only person that will enable me to give you my word that you promise to give you instructions on what I was desperately looking for a liberation movement like UNITA hence the money in company, I have with me, please contact your file/claim officer: GARVIN MARCUS. FOREIGN SERVICE MANAGER, Email : [email protected] Telephone :+31-620-885-334. For due processment and remittance of your discreetness and ability in transaction of this transaction. Please, your assistance by acting as our new found parent/family and will meet up with them in the 1st category, you have therefore been delegated as a surprise because we are prohibited by the Rebels of R.U.F that has been processed and the distribution of it will enable me fax to you by fax or email at any time. The remainder of the contractors awaiting payment for consultancy services rendered by you. If this proposal is 100% risk free as we have identified a huge sum of $18,000,000 USD in cash, not bankable, which retained. Regards, Mrs Rose From: "alex princewill" [email protected] To: [email protected] Sent: Tuesday, June 11, 2002 1:51 PM Subject: INVESTMENT PROPOSAL/ TO AUDITING AND ACCOUNT UNIT. FORIEGN REMITTANCE DEPT. UNION TOGOLAISE DU BANQUE LOME-TOGO.IN WEST AFRICA. Attn, I am Mr.Alex princewill. the director in charge of auditing and account section of Union Togolaise Du Banque Lome-Togo with due respect and regard. I have decided to contact you on a business 

Question: where could my mistake be?


  • Even with the -P flag, grep can't do multiline matching AFAIK (at least, not without hacks like slurping the whole file by treating it as null-terminated). Also, does your excerpt actually contain a match? If not, it's of limited use as a test case.
  • I have updated my post, so there will be a match
  • Where's the match? Show us the desired output as I asked you to do in chat.
  • @terdon it's there. The line is shown highlighted. The match is in the middle of the excerpt: 'Arthur Dent Galaxy 7 74369 Third Orbit'
  • And how do you define street names and numbers? You have Galaxy 7, but what about 4 Privet drive or All hail johnny lane 14-12 etc.?


First of all, there is no matching data in the file you are showing here. Assuming you do have a file with the kind of data you intend to search for, you should invoke grep with these options:

grep -zoP 

-z makes grep treat the input as NUL-separated records, so a text file without NUL bytes is handled as one huge string. -o gives you only the matching part. -P enables the Perl regex engine, so that grep understands the kind of regex you have there.
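Put together with the pattern from the question, the call could look like the following sketch. The sample file and its contents are made up for illustration, and the optional second city word `( [a-zA-Z]+)?` is an adaptation based on the question's note that a city name may be one or two words; `tr -d '\0'` only strips the NUL that grep appends to each match in -z mode.

```shell
# Hypothetical sample file mirroring the address block from the question
sample=$(mktemp)
printf 'Regards, Sandra Savimbi\nArthur Dent\nGalaxy 7\n74369 Third Orbit\nDate: Sun, 9 Jun 2002\n' > "$sample"

# -z: NUL-separated records, so the whole file is one record and \n can match
# -o: print only the matched text
# -P: Perl-compatible regex (needed here for \n and \d)
match=$(grep -zoP '[a-zA-Z]+ [a-zA-Z]+\n[a-zA-Z]+ [0-9]+\n\d{5} [a-zA-Z]+( [a-zA-Z]+)?' "$sample" | tr -d '\0')

printf '%s\n' "$match"
rm -f "$sample"
```

This assumes a grep built with PCRE support (GNU grep usually is); on other systems -P may not be available.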


  • I have updated my post so that the data I am looking for is there.
  • Yes, it works. Am I right that there is no way to get the line number in the file? Because '-nr' only prints 1 as the result, which is not correct
  • I just wanted to make sure. If the file is treated as a single line, the \n are no longer needed, are they? Because as far as I know \n means line break. On the other hand, the grep command does not work without the \n
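On the line-number question raised in the comments: with -z the whole file is a single record, which is why -n reports 1. One possible workaround, sketched here under assumptions, is to ask grep for the byte offset of the match (-b together with -o) and count the newlines that precede it. The sample file is made up, the pattern includes an optional second city word as an adaptation, and only the first match is handled.

```shell
# Hypothetical sample file mirroring the address block from the question
sample=$(mktemp)
printf 'Regards, Sandra Savimbi\nArthur Dent\nGalaxy 7\n74369 Third Orbit\nDate: Sun, 9 Jun 2002\n' > "$sample"

pattern='[a-zA-Z]+ [a-zA-Z]+\n[a-zA-Z]+ [0-9]+\n\d{5} [a-zA-Z]+( [a-zA-Z]+)?'

# -b with -o prints the 0-based byte offset of the match as "offset:match";
# keep only the offset of the first match
offset=$(grep -zobP "$pattern" "$sample" | tr '\0' '\n' | head -n 1 | cut -d: -f1)

# Number of newlines before the match, plus one, is the line where it starts
line=$(( $(head -c "$offset" "$sample" | wc -l) + 1 ))

echo "match starts on line $line"
rm -f "$sample"
```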


The postal code seems fairly distinctive, so if there is nothing in the file that looks like one but isn't, we could just use grep -B to get the preceding lines:

$ grep -B2 -Ee "^[0-9]{5} " spam
Arthur Dent
Galaxy 7
74369 Third Orbit

That is, look for five digits and a space at the start of a line, then print that line and the two before it. (-B stands for "before", -A for "after", -C for context in both directions.)
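If other lines in the file also start with five digits, a stricter pattern may cut down on false positives. The sketch below is an assumption based on the question's note (five digits followed by a city name of one or two alphabetic words, and nothing else on the line); the sample file is made up. Adding -n also prints line numbers, with "-" separating the number on context lines and ":" on the matching line.

```shell
# Hypothetical sample file mirroring the address block from the question
spam=$(mktemp)
printf 'Regards, Sandra Savimbi\nArthur Dent\nGalaxy 7\n74369 Third Orbit\nDate: Sun, 9 Jun 2002\n' > "$spam"

# -B2: also print the two lines before each match; -n: prefix line numbers
# The pattern anchors the whole line: 5 digits, then one or two alphabetic words
context=$(grep -n -B2 -E '^[0-9]{5} [a-zA-Z]+( [a-zA-Z]+)?$' "$spam")

printf '%s\n' "$context"
rm -f "$spam"
```

Unlike the -z approach, this works line by line, so it does not need to hold the whole 200 MB file as one record.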


I would suggest using pcregrep rather than plain grep for this, e.g.

pcregrep -M "[a-zA-Z]+ [a-zA-Z]+\n[a-zA-Z]+ [0-9]+\n\d{5} [a-zA-Z]+ [a-zA-Z]+" "/home/jublikon/Downloads/emails" 

If you want the line number, pcregrep supports the same -n option as grep, e.g.

$ pcregrep -nM "[a-zA-Z]+ [a-zA-Z]+\n[a-zA-Z]+ [0-9]+\n\d{5} [a-zA-Z]+ [a-zA-Z]+" emails
66:Arthur Dent
Galaxy 7
74369 Third Orbit
