OK, here's another go then.
Think of the delay line as a train with many carriages. The audio samples come in at the locomotive end, and exit at the last carriage. Each clock tick, they all move one carriage towards the rear, one sample in each carriage. So our train holds a recent history of the audio - typically for an FIR filter in audio hardware, the train could have up to 150 carriages.
An output sample (one per clock tick) is calculated by the ticket inspector running down the whole length of the train, noting the value of the audio sample in each carriage and putting that down in his notebook. The notebook also contains a list of coefficients - each carriage is associated with a fixed number, called a 'coefficient'. For each carriage he multiplies the audio sample by that carriage's coefficient, and the output sample is the sum of all these (coefficient x audio sample) products.
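If it helps to see the train and the inspector as code, here's a rough Python sketch. The names (fir_filter, train) and the little 2-coefficient example are made up for illustration, and real hardware wouldn't do it this way, but the arithmetic is the same.

```python
from collections import deque

def fir_filter(samples, coefficients):
    """Run samples through an FIR filter, one clock tick per input sample."""
    # The train: one carriage per coefficient, all starting empty (zero).
    train = deque([0.0] * len(coefficients), maxlen=len(coefficients))
    output = []
    for sample in samples:
        # New sample boards at the locomotive end; because the deque is
        # full, the oldest sample drops off the last carriage.
        train.appendleft(sample)
        # The inspector walks the train, summing coefficient * sample.
        output.append(sum(c * s for c, s in zip(coefficients, train)))
    return output

# Two carriages, both with coefficient 0.5: a simple two-sample average.
print(fir_filter([1.0, 2.0, 3.0], [0.5, 0.5]))  # [0.5, 1.5, 2.5]
```

The first output is only 0.5 because the second carriage is still empty (zero) on the first tick.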
Now in the case of jkeny's files, they were created by using a notebook with only two numbers in it - 1 and 0.0112 (every other carriage effectively has a coefficient of zero). The 1 was applied to (I think) the 21st carriage back and the 0.0112 to the first carriage. So to generate each output sample, the ticket inspector only had to multiply together two pairs of numbers, and in one case that multiplication was by unity. In the general case, an FIR filter will have more than 2 coefficients - the sequence of them typically looks like the sin(x)/x wiggles seen on the impulse responses of DACs etc shown in Stereophile reviews.
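The two-number notebook can be sketched like this. The delay of 20 samples (i.e. the 21st carriage) is only my recollection, so treat it as an assumption, and the function name two_tap is made up for the example.

```python
def two_tap(samples, delay=20, small=0.0112):
    """y[n] = small * x[n] + 1 * x[n - delay].

    delay=20 (the 21st carriage) is an assumed value, not a confirmed one.
    """
    out = []
    for n, x in enumerate(samples):
        # Sample sitting 'delay' carriages back; zero until the train fills.
        delayed = samples[n - delay] if n >= delay else 0.0
        out.append(small * x + delayed)
    return out

# Feed in a single click (an impulse) followed by silence.
impulse = [1.0] + [0.0] * 24
response = two_tap(impulse)
print([(n, v) for n, v in enumerate(response) if v != 0.0])
# [(0, 0.0112), (20, 1.0)]
```

So a single click in produces a very quiet copy straight away, then the full-size click 20 samples later.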
Any better?