Wildcards
Typing every file name on a command line to pass them as arguments to a command quickly becomes tedious, for example:
cp genome_1.fasta genome_2.fasta genome_3.fasta all_genomes
The shell provides special characters, called wildcards, that can match several characters in file names.
There are four main kinds of wildcards:
- the asterisk *
- the question mark ?
- the square brackets []
- the curly brackets {}
Asterisk Wildcard "*"
"*" is the most commonly used wildcard. It matches any string: no character at all, a single character, or any sequence of characters.
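As a minimal sketch of the asterisk at work, the following reuses the genome file names from the cp example above inside a throwaway directory. The `sandbox`, `star_matches` and `copied` variable names and the extra `notes.txt` file are only illustrative assumptions.

```shell
# Work in a throwaway directory so the demo cannot touch real files.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
touch genome_1.fasta genome_2.fasta genome_3.fasta notes.txt
mkdir all_genomes

# The shell replaces the pattern with the matching names before the command runs.
star_matches=$(echo genome_*.fasta)
echo "$star_matches"

# Same effect as: cp genome_1.fasta genome_2.fasta genome_3.fasta all_genomes
cp genome_*.fasta all_genomes
copied=$(ls all_genomes)
echo "$copied"
```

Running `echo` with a pattern first is a cheap way to see which names a command will really receive.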
Question Mark Wildcard "?"
"?" matches exactly one character, whatever that character is.
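A small sketch of the difference with "*", using made-up file names (`a1`, `a2`, `a10`, `b1`) in a temporary directory:

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
touch a1 a2 a10 b1

# a? : "a" followed by exactly one character, so a10 is left out.
one_char=$(echo a?)
echo "$one_char"     # a1 a2

# a?? : "a" followed by exactly two characters.
two_chars=$(echo a??)
echo "$two_chars"    # a10
```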
Square Brackets Wildcard "[]"
Square brackets "[]" let you match one character taken from a set or a range of values.
For instance, [abcd] matches any single character among a, b, c and d.
Instead of enumerating all the allowed characters, you can use a range such as [a-f].
You can negate the set by putting ^ right after the opening bracket: [^abcd] matches any single character that is not in the list (the POSIX spelling is [!abcd]; bash accepts both).
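The three forms can be tried side by side. This is only a sketch: the file names are invented, the locale is forced to C so that the range behaves predictably, and the negation is written with the POSIX `[!...]` spelling, which bash treats the same as `[^...]`.

```shell
export LC_ALL=C
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
touch a1 b1 c1 e1 g1 z1

listed=$(echo [abcd]1)     # explicit list of allowed characters
ranged=$(echo [a-f]1)      # range from a to f
negated=$(echo [!a-f]1)    # any character outside the range

echo "$listed"     # a1 b1 c1
echo "$ranged"     # a1 b1 c1 e1
echo "$negated"    # g1 z1
```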
Warning
When you use a range, the shell follows the collation order of the current locale,
so the result can differ depending on your locale settings.
Create a sandbox:
~$ mkdir XX
~$ cd XX
~/XX$ touch {a,A,b,B}{01..10}
Try with the C locale:
~/XX$ export LC_ALL=C
~/XX$ ls
A01 A03 A05 A07 A09 B01 B03 B05 B07 B09 a01 a03 a05 a07 a09 b01 b03 b05 b07 b09
A02 A04 A06 A08 A10 B02 B04 B06 B08 B10 a02 a04 a06 a08 a10 b02 b04 b06 b08 b10
~/XX$ ls -d [a-b]*
a01 a02 a03 a04 a05 a06 a07 a08 a09 a10 b01 b02 b03 b04 b05 b06 b07 b08 b09 b10
~/XX$ ls -d [ab]*
a01 a02 a03 a04 a05 a06 a07 a08 a09 a10 b01 b02 b03 b04 b05 b06 b07 b08 b09 b10
Now change your locale:
~/XX$ export LC_ALL=en_US.UTF-8
~/XX$ ls
a01 a02 a03 a04 a05 a06 a07 a08 a09 a10 b01 b02 b03 b04 b05 b06 b07 b08 b09 b10
A01 A02 A03 A04 A05 A06 A07 A08 A09 A10 B01 B02 B03 B04 B05 B06 B07 B08 B09 B10
~/XX$ ls -d [ab]*
a01 a02 a03 a04 a05 a06 a07 a08 a09 a10 b01 b02 b03 b04 b05 b06 b07 b08 b09 b10
~/XX$ ls -d [a-b]*
a01 a02 a03 a04 a05 a06 a07 a08 a09 a10 b01 b03 b05 b07 b09
A01 A02 A03 A04 A05 A06 A07 A08 A09 A10 b02 b04 b06 b08 b10
With the en_US.UTF-8 locale, the range [a-b] also matches the files and directories starting with A: this locale sorts letters as a, A, b, B, so A falls between a and b, while B falls outside the range.
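One way to avoid depending on the collation order, not covered in the text above and offered here only as a sketch, is to use the POSIX character classes `[[:lower:]]` and `[[:upper:]]` instead of a letter range. The file names are a reduced version of the sandbox above.

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
touch a01 A01 b01 B01

# Character classes select by category, not by position in the sort order.
lower_only=$(echo [[:lower:]]*)
upper_only=$(echo [[:upper:]]*)

echo "$lower_only"    # a01 b01
echo "$upper_only"    # A01 B01
```

An explicit list such as [ab], as in the transcript above, is the other locale-independent option.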
Curly Brackets Wildcard "{}"
Although "{}" is not really a wildcard, it is used in much the same context.
Curly brackets "{}" let you define a list of values.
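A short sketch of the list form, assuming bash (brace expansion is not available in every shell). Unlike the wildcards above, the generated words do not have to be names of existing files, which is why "{}" is not a true wildcard. The `list_words` variable is only illustrative.

```shell
# Each comma-separated item produces one word, in the order written.
echo {foo,bar,baz}          # foo bar baz

# Text around the braces is repeated for every item, whether or not
# files with these names exist.
list_words=$(echo genome_{1,2,3}.fasta)
echo "$list_words"          # genome_1.fasta genome_2.fasta genome_3.fasta
```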
Brace expansion using ranges is written giving the startpoint and the endpoint of the range.
This is a "sequence expression". The sequences can be of two types:
- integers (optionally zero padded, optionally with a given increment)
- characters
$ echo {5..12}
5 6 7 8 9 10 11 12
$ echo {c..k}
c d e f g h i j k
When you mix the two types, brace expansion is not performed and the expression is left unchanged.
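The optional increment and the mixed-type case can be checked quickly. This sketch assumes bash 4 or later, where the third `..N` field of a sequence is understood; the variable names are illustrative.

```shell
# {start..end..step}: the third field is the increment.
stepped=$(echo {0..10..2})
echo "$stepped"       # 0 2 4 6 8 10

# The increment also works with characters.
letters=$(echo {a..i..3})
echo "$letters"       # a d g

# A start point larger than the end point counts down.
countdown=$(echo {5..1})
echo "$countdown"     # 5 4 3 2 1

# Mixing an integer and a character: no expansion, the text stays as typed.
mixed=$(echo {5..k})
echo "$mixed"         # {5..k}
```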
When you zero-pad one of the numbers (or both) in a range, the generated range is zero-padded too:
$ echo {01..10}
01 02 03 04 05 06 07 08 09 10
As with the expansion of string lists, you can add a preamble and a postscript string:
$ echo 1.{0..9}
1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9
$ echo ---{A..E}---
---A--- ---B--- ---C--- ---D--- ---E---
$ echo {A..Z}{0..9}
A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 C0 C1 C2 C3 C4 C5 C6
C7 C8 C9 D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 E0 E1 E2 E3 E4 E5 E6 E7 E8 E9 F0 F1 F2 F3
F4 F5 F6 F7 F8 F9 G0 G1 G2 G3 G4 G5 G6 G7 G8 G9 H0 H1 H2 H3 H4 H5 H6 H7 H8 H9 I0
I1 I2 I3 I4 I5 I6 I7 I8 I9 J0 J1 J2 J3 J4 J5 J6 J7 J8 J9 K0 K1 K2 K3 K4 K5 K6 K7
K8 K9 L0 L1 L2 L3 L4 L5 L6 L7 L8 L9 M0 M1 M2 M3 M4 M5 M6 M7 M8 M9 N0 N1 N2 N3 N4
N5 N6 N7 N8 N9 O0 O1 O2 O3 O4 O5 O6 O7 O8 O9 P0 P1 P2 P3 P4 P5 P6 P7 P8 P9 Q0 Q1
Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 S0 S1 S2 S3 S4 S5 S6 S7 S8
S9 T0 T1 T2 T3 T4 T5 T6 T7 T8 T9 U0 U1 U2 U3 U4 U5 U6 U7 U8 U9 V0 V1 V2 V3 V4 V5
V6 V7 V8 V9 W0 W1 W2 W3 W4 W5 W6 W7 W8 W9 X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 Y0 Y1 Y2
Y3 Y4 Y5 Y6 Y7 Y8 Y9 Z0 Z1 Z2 Z3 Z4 Z5 Z6 Z7 Z8 Z9
$ echo {{A..Z},{a..z}}
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z
Common use and examples
Mass download from the Web
In this example, wget is used to download documentation that is split over several numbered webpages.
wget won't see your braces. It will see 6 different URLs to download:
wget http://docs.example.com/documentation/slides_part{1,2,3,4,5,6}.html
Of course it's possible, and even easier, to do that with a sequence:
wget http://docs.example.com/documentation/slides_part{1..6}.html
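To see what wget would actually receive without downloading anything, the same expression can be handed to `printf` first. This is just a preview sketch assuming bash; `urls` is an illustrative variable name.

```shell
# The shell expands the braces, so printf gets six separate arguments
# and prints one URL per line.
urls=$(printf '%s\n' http://docs.example.com/documentation/slides_part{1..6}.html)
echo "$urls"
```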
Generate a subdirectory structure
Your life is hard? Let's ease it a bit - that's what shells are here for:
mkdir /home/bash/test/{foo,bar,baz,cat,dog}
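Several brace lists can be combined to build a deeper tree in one command. The sketch below assumes bash and works in a temporary directory; the `project`, `src`, `doc`, `test`, `input` and `output` names are made up for the illustration.

```shell
base=$(mktemp -d)

# 3 x 2 combinations; -p also creates the missing parent directories.
mkdir -p "$base"/project/{src,doc,test}/{input,output}

# Count the leaf directories that were created.
created=$(find "$base/project" -mindepth 2 -type d | wc -l)
echo "$created"
```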
(to go further with brace expansion)
Escaping characters
Sometimes you need to tell the shell to ignore characters that it would normally interpret as special characters or as wildcards.
Examples:
- your file name contains a space (normally interpreted as the separator between command-line arguments)
- your file name contains a * (asterisk). For example, to remove the file named "*", can you issue
rm *
? (Note: don't try that, it would remove every file in the current directory.)
There are several ways to escape special characters:
- use the character \ (backslash) to protect the character that follows it.
- enclose the file name in ' (single quotes) or " (double quotes).
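Both situations can be reproduced safely in a throwaway directory. In this sketch the file names `keep_me.txt` and `my file.txt` are invented for the demonstration.

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
touch '*' keep_me.txt 'my file.txt'

# Quoted, the asterisk is an ordinary character: only the file named * goes.
rm '*'
after_star=$(ls)
echo "$after_star"

# The backslash protects the space, so rm receives a single file name.
rm my\ file.txt
remaining=$(ls)
echo "$remaining"    # keep_me.txt
```

Writing `rm \*` in the first step would have the same effect as `rm '*'`.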
Warning
Backslashes, single quotes and double quotes offer different levels of protection:
+----------------------+---------------------------------------------------------------+
| Symbol               | Protection                                                    |
+======================+===============================================================+
| \ (backslash)        | protects only the character following the backslash           |
+----------------------+---------------------------------------------------------------+
| ' (single-quote)     | every special character in between is ignored                 |
+----------------------+---------------------------------------------------------------+
| " (double-quote)     | every special character in between is ignored                 |
|                      | except $, ` (backquote) and \ (backslash)                     |
+----------------------+---------------------------------------------------------------+
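The three levels of protection can be compared on the same text. This is a minimal sketch; the variable `name` and its value are arbitrary.

```shell
name=world

# Backslash: only the character right after it ($) loses its special meaning.
bs=$(echo \$name)
echo "$bs"      # $name

# Single quotes: nothing is interpreted, neither $ nor *.
sq=$(echo '$name *')
echo "$sq"      # $name *

# Double quotes: $name is still replaced, but * is not expanded to file names.
dq=$(echo "$name *")
echo "$dq"      # world *
```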
Exercises: