Linux – How to Use sed to Delete a String with Wildcards

Scenario

File1:

<a>hello</b> <c>foo</d> <a>world</b> <c>bar</d> 

This is an example of the file to process. How can one use sed to remove every string of the form <c>*</d>, that is, each <c> tag, its closing </d>, and everything in between?

Solution

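The following line removes every stretch of text from a <c> to the nearest </d>, inclusive:

sed -e 's/<c>[^<]*<\/d>//g' 

The part inside s/...// is a regular expression, not a wildcard of the kind the shell uses, so anything that is valid in a regular expression can go there. The difference matters here: the obvious translation of the wildcard, <c>.*<\/d>, is greedy and matches from the first <c> to the last </d> on the line, so it would delete <a>world</b> as well. [^<]* matches only characters other than <, so each match stops at its own </d> (this assumes the text between the tags contains no < character). The trailing g repeats the substitution for every match on the line instead of only the first.

The complete command is therefore:

sed -e 's/<c>[^<]*<\/d>//g' input.txt > output.txt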
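A quick way to check a pattern before running it on a real file is to pipe the sample line into sed. The sketch below assumes a POSIX shell and sed; the variable names are only illustrative. It compares a greedy .* with the negated class [^<]* on the line from File1.

```shell
#!/bin/sh
# Sample line from File1 (without the trailing space, for easier comparison).
line='<a>hello</b> <c>foo</d> <a>world</b> <c>bar</d>'

# Greedy: .* runs from the first <c> to the LAST </d> on the line,
# so <a>world</b> is removed as well.
greedy=$(printf '%s\n' "$line" | sed -e 's/<c>.*<\/d>//')

# Negated class: [^<]* cannot cross a tag boundary, and the g flag
# repeats the substitution for every <c>...</d> pair on the line.
per_tag=$(printf '%s\n' "$line" | sed -e 's/<c>[^<]*<\/d>//g')

# Brackets make leftover spaces visible.
printf 'greedy:  [%s]\n' "$greedy"
printf 'per tag: [%s]\n' "$per_tag"
```

The greedy result keeps only <a>hello</b>, while the per-tag result keeps both <a> elements. The spaces that surrounded the removed tags are left in place in both cases.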

Example

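To run several substitutions, or other sed commands, in a single pass, give each one its own -e option. They are applied in order to every line:

sed -e 's/^\[.*\]//' -e 's/^Found.*//' -e 's/.*hosts.*//' in.txt > out.txt

This strips a leading [...] block, then empties every line that starts with Found and every line that contains hosts. Note that .* means "any run of characters"; a bare * only repeats the character before it, so Found* would match Foun, Found, Foundd, and so on.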
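To see what chained -e expressions do, it helps to run them on a few lines of known input. The sketch below uses made-up sample text (not from any real log) and patterns in the same spirit as the command above. It also shows, as an alternative, sed's d command, which drops matching lines entirely instead of leaving them empty.

```shell
#!/bin/sh
# Hypothetical input, invented for illustration only.
input='[12:00] scan started
Found 3 devices
checking hosts file
done'

# Substitution only: matching lines are emptied, but the blank lines remain.
result=$(printf '%s\n' "$input" |
  sed -e 's/^\[.*\]//' -e 's/^Found.*//' -e 's/.*hosts.*//')

# Alternative: /pattern/d deletes the whole line, leaving no blank line behind.
deleted=$(printf '%s\n' "$input" |
  sed -e 's/^\[.*\]//' -e '/^Found/d' -e '/hosts/d')

printf '%s\n' "--- substituted ---" "$result" "--- deleted ---" "$deleted"
```

The first variant produces four lines, two of them empty; the second produces only " scan started" and "done".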