grep to ignore patterns

I am extracting URLs from a website using cURL, as below.

curl www.somesite.com | grep "<a href=.*title=" > new.txt

My new.txt file looks like this.

<a href="http://website1.com" title="something">
<a href="http://website1.com" information="something" title="something">
<a href="http://website2.com" title="some_other_thing">
<a href="http://website2.com" information="something" title="something">
<a href="http://websitenotneeded.com" title="something NOTNEEDED">

However, I only need to extract the information below.

<a href="http://website1.com" title="something">
<a href="http://website2.com" information="something" title="something">

I am trying to ignore the <a href entries that have information in them and whose title ends with NOTNEEDED.

How can I modify my grep statement?

Comments

Is the output shown here correct? The text describing it doesn't make sense with this example.
Aren't you looking for curl www.somesite.com | grep "<a href=.*title=" | grep -v NOTNEEDED > new.txt?
@terdon, that is exactly what I was looking for. I can accept it as the answer if you post it.
Ramesh, it's essentially @slm's answer. I just edited it so you can accept it.
Oh yes, I didn't know the pipe was that powerful. I've accepted it as the answer. Thanks!

Answer

I'm not entirely following your example plus the description, but it sounds like what you want is this:

$ grep -v "<a href=.*title=.*NOTNEEDED" sample.txt
<a href="http://website1.com" title="something">
<a href="http://website1.com" information="something" title="something">
<a href="http://website2.com" title="some_other_thing">
<a href="http://website2.com" information="something" title="something">

So for your example:

$ curl www.example.com | grep "<a href=.*title=" | grep -v NOTNEEDED > new.txt
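
To try the filtering without hitting a real site, here is a minimal sketch that uses a local file in place of the curl output. The file names sample.txt and filtered.txt are assumptions for illustration, and the last step (dropping lines with an information= attribute) is only one possible reading of the question, not something the asker confirmed.

```shell
# sample.txt stands in for the curl output; the lines come from the question.
cat > sample.txt <<'EOF'
<a href="http://website1.com" title="something">
<a href="http://website1.com" information="something" title="something">
<a href="http://website2.com" title="some_other_thing">
<a href="http://website2.com" information="something" title="something">
<a href="http://websitenotneeded.com" title="something NOTNEEDED">
EOF

# Keep anchor lines that carry a title, then drop the ones mentioning NOTNEEDED.
grep '<a href=.*title=' sample.txt | grep -v 'NOTNEEDED' > new.txt

# Optional extra step (assumption): also drop anchors with an information= attribute.
grep -v 'information=' new.txt > filtered.txt
```

Each grep -v stage removes one kind of unwanted line, so further exclusions can be appended to the pipeline in the same way.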

Comments

I have a class in the <a href section. Basically, I don't want that in my output.

Answer

The grep man page says:

-v, --invert-match
    Invert the sense of matching, to select non-matching lines. (-v is specified by POSIX.)

You can use a regular expression with alternation to exclude several patterns at once, or chain several grep -v commands:

grep -v "red\|green\|blue"

grep -v red | grep -v green | grep -v blue
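
Note that \| inside a basic regular expression is a GNU extension, so it may not work with every grep. As a rough sketch, the variants below should behave the same on any POSIX grep; the sample words and variable names are made up for illustration.

```shell
# Hypothetical input: one word per line.
colors=$(printf '%s\n' red green blue yellow orange)

# Extended regular expression: plain | is alternation with -E.
out_ere=$(printf '%s\n' "$colors" | grep -Ev 'red|green|blue')

# Several -e patterns: a line is excluded if it matches any of them.
out_e=$(printf '%s\n' "$colors" | grep -v -e red -e green -e blue)

# Chained inversions: each stage removes one pattern.
out_chain=$(printf '%s\n' "$colors" | grep -v red | grep -v green | grep -v blue)

printf '%s\n' "$out_ere"
```

All three leave only the lines that contain none of the listed words.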
