grep to ignore patterns | Complex Solutions

I am extracting URLs from a website using cURL, as shown below.

curl www.somesite.com | grep "<a href=.*title=" > new.txt

My new.txt file looks like this.

<a href="http://website1.com" title="something">
<a href="http://website1.com" information="something" title="something">
<a href="http://website2.com" title="some_other_thing">
<a href="http://website2.com" information="something" title="something">
<a href="http://websitenotneeded.com" title="something NOTNEEDED">

However, I only need to extract the following information.

<a href="http://website1.com" title="something">
<a href="http://website2.com" information="something" title="something">

I am trying to ignore the <a href entries that have information in them and whose title ends with NOTNEEDED.

How can I modify my grep statement?

Comments

Is the output you're showing here correct? The text describing it doesn't make sense together with this example.
Aren't you just looking for curl www.somesite.com | grep "<a href=.*title=" | grep -v NOTNEEDED > new.txt?
@terdon, that was exactly what I was looking for. I can accept it as the answer if you post it.
Ramesh, that's basically @slm's answer. I've just edited it so that you can accept it.
Oh yes, I hadn't realized that pipes were so powerful. I've accepted it as the answer. Thanks!

Answer

I'm not completely following your example plus the description, but it sounds like what you want is this:

$ grep -v "<a href=.*title=.*NOTNEEDED" sample.txt
<a href="http://website1.com" title="something">
<a href="http://website1.com" information="something" title="something">
<a href="http://website2.com" title="some_other_thing">
<a href="http://website2.com" information="something" title="something">

So, for your example:

$ curl www.example.com | grep "<a href=.*title=" | grep -v NOTNEEDED > new.txt
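To try the pipeline without fetching a real page, here is a minimal sketch that feeds the sample lines from the question through the same two filters. The variable names (sample, filtered, strict) and the extra non-link line are illustrative assumptions, and the last step, which also drops lines containing information=, is only a guess at what the question's description asks for.

```shell
# Stand-in for the curl output: the lines from the question plus one
# line without a link (added here for illustration only).
sample=$(cat <<'EOF'
<a href="http://website1.com" title="something">
<a href="http://website1.com" information="something" title="something">
<a href="http://website2.com" title="some_other_thing">
<a href="http://website2.com" information="something" title="something">
<a href="http://websitenotneeded.com" title="something NOTNEEDED">
<p>no link on this line</p>
EOF
)

# Keep lines with an anchor that has a title, then drop the NOTNEEDED ones.
filtered=$(printf '%s\n' "$sample" | grep '<a href=.*title=' | grep -v 'NOTNEEDED')

# Assumed stricter reading of the question: also drop lines that carry
# an information= attribute. Each -e adds one more pattern to exclude.
strict=$(printf '%s\n' "$sample" | grep '<a href=.*title=' | grep -v -e 'NOTNEEDED' -e 'information=')

printf '%s\n' "$filtered"
printf '%s\n' '---'
printf '%s\n' "$strict"
```

The first grep selects, the second one removes, so the order of the two stages does not change the result here. It only affects how many lines the later stage has to scan.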

Comments

I have a class in the <a href section. Basically, I don't want that in my output.

Answer

The grep man page says:

-v, --invert-match
       Invert the sense of matching, to select non-matching lines. (-v is specified by POSIX.)

You can exclude several patterns at once with a regular expression, or by chaining grep -v calls:

grep -v "red\|green\|blue"

grep -v red | grep -v green | grep -v blue
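As a rough check that these forms agree, the sketch below runs them on a small made-up word list. The input words and variable names are assumptions for illustration. Note that \| inside a basic regular expression is a GNU extension, so the -E and -e variants are shown as more portable spellings of the same filter.

```shell
# Made-up input: one word or phrase per line.
colors=$(printf '%s\n' red green blue yellow 'dark red' orange)

# Alternation with \| in a basic regular expression (GNU grep extension).
a=$(printf '%s\n' "$colors" | grep -v 'red\|green\|blue')

# Extended regular expression: alternation without backslashes.
b=$(printf '%s\n' "$colors" | grep -Ev 'red|green|blue')

# Several -e patterns: a line is dropped if any one of them matches.
c=$(printf '%s\n' "$colors" | grep -v -e red -e green -e blue)

# Chained inversions, one pattern per stage.
d=$(printf '%s\n' "$colors" | grep -v red | grep -v green | grep -v blue)

printf '%s\n' "$a"
```

With GNU grep all four variables hold the same two lines, yellow and orange. The line "dark red" is dropped because grep matches substrings unless told otherwise (for example with -w or -x).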
