Bash Wildcards - 8 Practical Examples

Globbing is path-name expansion: the shell replaces a pattern containing wildcards with the list of matching file names. In this article, we will discuss how wildcards can make life easier for a command-line user. Please note that it is the shell (bash in this case) that performs the globbing. Neither the OS nor the executed program processes the wildcards; the program only receives the already expanded list of names.
NOTE: All the examples in this article were tested on the bash shell in Ubuntu 13.04.

Bash Wildcards - Simple Globbing

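In this section, we will discuss simple examples of the following four commonly used bash patterns. The first three are wildcards that are matched against existing file names; curly brackets trigger brace expansion, which generates names whether or not matching files exist.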
  • Asterisk – *
  • Question Mark – ?
  • Square Brackets – []
  • Curly Brackets – {}

Q1. Is there something that represents any number of characters?

Ans. Yes. An asterisk (*) matches any number of characters, including none. It is one of the most popular wildcard characters and is extremely useful when you don't care what replaces it as long as the rest of the pattern matches. For example, to list all the files that have a .c extension, pass *.c as the argument to the ls command. Here the file name does not matter, so an asterisk stands in for it.
Here is an example screenshot: [image: asterisk]
As you can see, the shell replaced the asterisk (*) pattern with the matching file names.
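
The same behavior can be reproduced with a minimal sketch like the one below. The scratch directory and the file names main.c, util.c and notes.txt are assumptions invented for this illustration; they are not taken from the original example.

```shell
# Work in a scratch directory so no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical sample files (names invented for this sketch).
touch main.c util.c notes.txt

# The shell expands *.c before the command runs, so echo (or ls)
# only ever sees the matching names.
c_files=$(echo *.c)
echo "$c_files"    # prints: main.c util.c

# Clean up the scratch directory.
cd - >/dev/null && rm -rf -- "$demo_dir"
```

Using echo instead of ls makes it easy to see that the expansion is done by the shell and not by the command.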

Q2. Is there something that represents only one character?

Ans. Yes, a question mark (?) matches exactly one character. This wildcard is useful when file names are identical except for a single character. For example, if there are five log files (log1.txt, log2.txt and so on up to log5.txt), you can use log?.txt to list the details of all of them or to perform any other operation on them collectively. A two-digit name such as log10.txt would not match, because ? stands for one character only.
Here is an example screenshot: [image: questionMark]
As you can see, the question mark (?) wildcard was replaced by a single character (a digit in this example).
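
Here is a minimal sketch of the question mark wildcard, assuming a scratch directory with five single-digit log files. The extra file log10.txt is a hypothetical addition, included only to show that ? never matches two characters.

```shell
# Work in a scratch directory so no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Five single-digit log files plus one two-digit name (hypothetical).
touch log1.txt log2.txt log3.txt log4.txt log5.txt log10.txt

# ? matches exactly one character, so log10.txt is left out.
single_char=$(echo log?.txt)
echo "$single_char"    # prints: log1.txt log2.txt log3.txt log4.txt log5.txt

# Clean up the scratch directory.
cd - >/dev/null && rm -rf -- "$demo_dir"
```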

Q3. Is there something that represents a range of characters?

Ans. Yes, square brackets match any single character from the set or range listed inside them. This wildcard is especially useful when you want to select file names based on a set of characters. For example, to delete only log1.txt, log2.txt and log3.txt out of the five log files (created in the previous example), just pass log[1-3].txt to rm.
Here is an example screenshot: [image: square-bracket]
This command should delete log1.txt, log2.txt and log3.txt. Let's confirm it using the question mark wildcard that was discussed in the previous example:
[image: question-mark]
As you can see, only log4.txt and log5.txt are listed in the output. This means that the other three files were successfully removed through the square bracket wildcard [].
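
The delete-and-confirm steps can be sketched as follows. This is an illustration run in a scratch directory, with rm assumed as the delete command; because the pattern is expanded before rm runs, it is worth previewing a pattern with echo before deleting real files.

```shell
# Work in a scratch directory so no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch log1.txt log2.txt log3.txt log4.txt log5.txt

# Preview what the pattern expands to before deleting anything.
to_delete=$(echo log[1-3].txt)
echo "$to_delete"     # prints: log1.txt log2.txt log3.txt

# [1-3] matches exactly one character in the range 1 to 3.
rm -- log[1-3].txt

# Confirm with the question mark wildcard.
remaining=$(echo log?.txt)
echo "$remaining"     # prints: log4.txt log5.txt

# Clean up the scratch directory.
cd - >/dev/null && rm -rf -- "$demo_dir"
```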

Q4. Is there something that cuts short repetitive work on the command line?

Ans. Yes, curly brackets {} let you write only the variable parts (separated by commas) of a set of similar arguments, and hence they cut short long or repetitive commands. For example, to create three files log1.txt, log2.txt and log3.txt, you would either run the touch command three times or write something like touch log1.txt log2.txt log3.txt. This effort can be cut further by using curly brackets: just write touch log{1,2,3}.txt.
Here is an example screenshot: [image: curlyBrackets]
This command should create three files: log1.txt, log2.txt and log3.txt. Let's confirm it using the ls command:
[image: question-mark-1]
As you can see, all three files were created successfully.
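
Here is a minimal sketch of the curly bracket form, run in an empty scratch directory. It first echoes the expansion to show that the names are generated even though no such files exist yet, and then creates the files.

```shell
# Work in an empty scratch directory so no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Brace expansion generates the names without looking at the file system.
generated=$(echo log{1,2,3}.txt)
echo "$generated"    # prints: log1.txt log2.txt log3.txt

# The same expansion hands three arguments to touch.
touch log{1,2,3}.txt

# Confirm that the files now exist.
created=$(echo log?.txt)
echo "$created"      # prints: log1.txt log2.txt log3.txt

# Clean up the scratch directory.
cd - >/dev/null && rm -rf -- "$demo_dir"
```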

Bash Wildcards - Extended Globbing

In this section, we will discuss some advanced examples of globbing through pattern-list matching. Bash recognizes these patterns only when the extglob shell option is enabled (shopt -s extglob); the option is off by default, although some interactive setups turn it on. The patterns are:
  • ?(…|…)   -  Match zero or one occurrence of the patterns listed between the pipes (|)
  • *(…|…)   -  Match zero or more occurrences of the patterns listed between the pipes (|)
  • +(…|…)   -  Match one or more occurrences of the patterns listed between the pipes (|)
  • !(…|…)   -  Match anything except one of the patterns listed between the pipes (|)

Q1. How to find files whose names begin with one or more occurrences of a few known characters?

Ans. The pattern +(…|…) can be used in this case. Here is an example screenshot: [image: pattern+]
As you can see, the pattern +(fi|fr)* was used with the ls command to produce the desired results. Breaking the expression down, +(fi|fr) matches one or more occurrences of fi or fr at the start of a name, and the trailing asterisk (*) matches any number of characters, i.e., the remaining part of the file name.
Similarly, you can use ?(…|…) to match zero or one occurrence and *(…|…) to match zero or more occurrences of the patterns listed between the pipes (|).
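
The three repetition operators can be compared with a minimal sketch like the one below. The file names are assumptions invented for this illustration, and the extglob option is enabled explicitly because it is normally off when bash runs a script.

```shell
# Extended patterns need the extglob option; set it on its own line
# before any line that uses such a pattern.
shopt -s extglob

# Work in a scratch directory so no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical sample files (names invented for this sketch).
touch file.txt first.txt friend.txt data.txt log log1 log12 log3

# One or more occurrences of fi or fr, followed by anything.
plus_matches=$(echo +(fi|fr)*)
echo "$plus_matches"     # prints: file.txt first.txt friend.txt

# The three repetition operators applied to the same alternatives.
zero_or_one=$(echo log?(1|2))
zero_or_more=$(echo log*(1|2))
one_or_more=$(echo log+(1|2))
echo "$zero_or_one"      # prints: log log1
echo "$zero_or_more"     # prints: log log1 log12
echo "$one_or_more"      # prints: log1 log12

# Clean up the scratch directory.
cd - >/dev/null && rm -rf -- "$demo_dir"
```

Note that log3 is never matched, because 3 is not one of the listed alternatives.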

Q2. How to list all the file names except those that follow a known name pattern?

Ans. To discuss the solution to this problem, take the file name pattern from the previous example. The task is to display the names of all the files in the current directory while filtering out those that begin with fi or fr. The pattern !(…|…) can be used in this case.
Here are all the contents of the current directory: [image: dir-contents]
Here is the command that filters out the unwanted file names and displays the rest: [image: negate]
As you can see, the wildcard expression !(…|…) can be used for filtering.
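
Here is a minimal sketch of the filtering task. The exact command from the original screenshot is not reproduced here; the pattern !(fi*|fr*) is one way to express it, and the file names are assumptions invented for this illustration. The sketch also shows a common pitfall: putting the asterisk outside the parentheses filters nothing.

```shell
# Extended patterns need the extglob option; set it on its own line
# before any line that uses such a pattern.
shopt -s extglob

# Work in a scratch directory so no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical sample files (names invented for this sketch).
touch file.txt first.txt friend.txt data.txt notes.txt

# Everything except names that begin with fi or fr.
others=$(echo !(fi*|fr*))
echo "$others"        # prints: data.txt notes.txt

# Pitfall: in !(fi|fr)* the negated part may match an empty prefix and
# the trailing * then matches the whole name, so nothing is filtered out.
unfiltered=$(echo !(fi|fr)*)
echo "$unfiltered"    # prints: data.txt file.txt first.txt friend.txt notes.txt

# Clean up the scratch directory.
cd - >/dev/null && rm -rf -- "$demo_dir"
```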

Important Points To Remember

1. Do check the man page of the grep command to explore the predefined named classes of characters that can be used within bracket expressions. Bash accepts the same classes inside its [] wildcard, so the class name goes within an extra pair of brackets, for example [[:digit:]].
Here is a screenshot from the man page of the grep command: [image: grep]
Here is brief information about these predefined classes:
  • [:upper:]  -> Match uppercase characters
  • [:lower:]  -> Match lowercase characters
  • [:alpha:]  -> The union of [:upper:] and [:lower:]
  • [:digit:]  -> Match decimal digits, i.e. 0 to 9
  • [:alnum:]  -> The union of [:alpha:] and [:digit:]
  • [:space:]  -> Match spaces, newlines, tabs etc.
  • [:graph:]  -> Match characters (excluding space) that are graphically printable
  • [:print:]  -> Match characters (including space) that are printable
  • [:punct:]  -> Match punctuation characters, i.e. [:graph:] minus [:alnum:]
  • [:cntrl:]  -> Match non-printable control characters
  • [:xdigit:] -> Match hexadecimal digits
Here is an example that demonstrates the usage of these predefined named classes of characters: [image: digit-character-set]
As you can see, the class digit matched all the decimal digits.
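
The named classes can be tried out with a minimal sketch like the one below. The file names are assumptions invented for this illustration. Note the doubled brackets: the outer pair is the wildcard itself and the inner pair belongs to the class name.

```shell
# Work in a scratch directory so no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical sample files (names invented for this sketch).
touch alpha.txt beta2.txt Gamma.txt 7zip.txt

# Names that contain at least one decimal digit anywhere.
with_digit=$(echo *[[:digit:]]*)
echo "$with_digit"      # prints: 7zip.txt beta2.txt

# Names that start with an uppercase letter.
starts_upper=$(echo [[:upper:]]*)
echo "$starts_upper"    # prints: Gamma.txt

# Clean up the scratch directory.
cd - >/dev/null && rm -rf -- "$demo_dir"
```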
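2. Not all the techniques described here are portable. The *, ? and [] wildcards are understood by every POSIX shell, but curly brackets and the extended patterns are bash features (some other shells offer their own versions) and may not work in a non-bash shell. Keep this in mind especially if you are a programmer who wants to use wildcards from within program code: the exec() family of functions does not involve a shell at all, so a wildcard passed as an argument reaches the new program unexpanded unless you run the command through a shell.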
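
The difference between an expanded and an unexpanded wildcard can be sketched from the shell itself. Here count_args is a hypothetical helper defined only for this illustration, and the quoted '*.c' stands in for what a program would receive if the pattern were passed directly through an exec() call.

```shell
# Work in a scratch directory so no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a.c b.c

# Hypothetical helper: reports how many arguments it received.
count_args() { echo "$#"; }

# Unquoted: the shell expands the pattern into two file names.
expanded_count=$(count_args *.c)
echo "$expanded_count"    # prints: 2

# Quoted: no expansion happens, the command gets the literal string *.c.
literal_count=$(count_args '*.c')
echo "$literal_count"     # prints: 1

# A program that wants expansion can hand the command line to a shell.
via_shell=$(bash -c 'echo *.c')
echo "$via_shell"         # prints: a.c b.c

# Clean up the scratch directory.
cd - >/dev/null && rm -rf -- "$demo_dir"
```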
