Parsing Ftp.ListDirectoryDetails and ListDirectory
I just started a project that requires an FTP client to download a root directory recursively. That is, a user specifies a single directory, and the client downloads all the files in it, then traverses every subdirectory and grabs those files too. To traverse subdirectories, you have to parse the directory name out of the ListDirectoryDetails output and build a new path for your FtpWebRequest instance. While attempting to parse the output I ran into an interesting problem. I construct an FTP request, set the method to ListDirectoryDetails, get the response object, and read through the response.
There are two possible output formats in my scenario: UNIX style and Windows style.
UNIX Style:
Windows Style:
As you can see, the output is not delimited by a consistent separator such as a tab. The best way to tell whether a record is a directory is to look at the first character of the permissions column in the UNIX-style output (if it is “d”, it’s a directory). Using the far-left permissions column is more reliable than checking whether the filename has a “.extension”. So why not do a String.Split and chop the line into chunks? Because any name that contains a space will not be parsed correctly.
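To make the String.Split problem concrete, here is a minimal sketch. The Windows-style record is the one a commenter reports further down; the UNIX-style record, the `ListingDemo` class, and the `IsDirectory` helper are hypothetical, and the `<DIR>` check assumes the Windows-style output marks directories that way.

```csharp
using System;

static class ListingDemo
{
    // True when a detailed listing record describes a directory.
    // Assumption: UNIX-style records start with the permissions column,
    // and Windows-style records carry a "<DIR>" marker in place of a size.
    public static bool IsDirectory(string record)
    {
        return record.StartsWith("d", StringComparison.Ordinal)
            || record.Contains(" <DIR> ");
    }

    static void Main()
    {
        // Windows-style record for a file whose name contains spaces
        string windowsFile = "04-30-08 03:04PM 40960 File Space Mine.exe";
        // Hypothetical UNIX-style record for a directory named "my themes"
        string unixDir = "drwxr-xr-x 2 someuser users 4096 Jan 01 2008 my themes";

        // Splitting on spaces scatters the name across several tokens
        string[] tokens = windowsFile.Split(' ');
        Console.WriteLine(tokens.Length);              // 6
        Console.WriteLine(tokens[tokens.Length - 1]);  // Mine.exe

        Console.WriteLine(IsDirectory(unixDir));       // True
        Console.WriteLine(IsDirectory(windowsFile));   // False
    }
}
```

The name "File Space Mine.exe" ends up spread over three tokens, and nothing in the token list says where the name starts unless you already know how many columns precede it.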
After doing some research, I came across a couple of links that discussed using regex to parse UNIX- and Windows-style FTP directory output. In my experience regex is expensive. So how do you solve this? Why not submit two FtpWebRequests: one with Method = ListDirectory, to fetch the name of each file or directory (including whitespace), and one with Method = ListDirectoryDetails, to see which records are actually directories?
Yes, it’s chatty, but it handles a variety of common issues easily. Note the directory with several spaces:
Proof of concept code:
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Net;

    namespace ftpDirTest
    {
        class Program
        {
            const string Url = "ftp://ftp.cindercube.org/public_html/cindercube/wp-content/themes/";

            // Issue one FTP listing request and return the response lines.
            static List<string> ReadListing(string method)
            {
                FtpWebRequest request = (FtpWebRequest)WebRequest.Create(Url);
                request.Credentials = new NetworkCredential("someuser", "somepassword");
                request.Method = method;

                List<string> lines = new List<string>();
                using (WebResponse response = request.GetResponse())
                using (StreamReader reader = new StreamReader(response.GetResponseStream()))
                {
                    string line;
                    while ((line = reader.ReadLine()) != null)
                    {
                        lines.Add(line);
                    }
                }
                return lines;
            }

            static void Main(string[] args)
            {
                // Get file/directory names (spaces preserved)
                List<string> names = ReadListing(WebRequestMethods.Ftp.ListDirectory);
                // Get directory details
                List<string> details = ReadListing(WebRequestMethods.Ftp.ListDirectoryDetails);

                foreach (string line in details)
                {
                    // "d" in the first column = directory (UNIX-style output)
                    if (!line.StartsWith("d", StringComparison.Ordinal)) continue;

                    // Compare basic dir output to detail dir output to get the dir name
                    foreach (string name in names)
                    {
                        // Don't need the "." or ".." dirs
                        if (name == "." || name == "..") continue;

                        // Require the separating space so "foo" doesn't match a record for "barfoo"
                        if (line.EndsWith(" " + name, StringComparison.Ordinal))
                        {
                            // Found dir
                            Console.WriteLine(name);
                        }
                    }
                }
            }
        }
    }
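The comparison step in the proof of concept can be isolated into a function that needs no server, which makes it easier to try against awkward names. This is only a sketch: `DirectoryMatcher`, `FindDirectories`, and the sample listings are hypothetical, and it assumes UNIX-style records. It also keeps the longest matching name, a heuristic added here because one record's tail can match several names.

```csharp
using System;
using System.Collections.Generic;

static class DirectoryMatcher
{
    // Returns the names from the plain listing that the detailed listing marks as directories.
    public static List<string> FindDirectories(IEnumerable<string> details, IEnumerable<string> names)
    {
        List<string> found = new List<string>();
        foreach (string record in details)
        {
            if (!record.StartsWith("d", StringComparison.Ordinal)) continue;

            // Several names can match the tail of one record ("themes" and
            // "old themes" both match "... old themes"), so keep the longest.
            string best = null;
            foreach (string name in names)
            {
                if (name == "." || name == "..") continue;
                if (record.EndsWith(" " + name, StringComparison.Ordinal)
                    && (best == null || name.Length > best.Length))
                {
                    best = name;
                }
            }
            if (best != null) found.Add(best);
        }
        return found;
    }

    static void Main()
    {
        // Hypothetical listings, standing in for the two FTP responses
        string[] details =
        {
            "drwxr-xr-x 2 someuser users 4096 Jan 01 2008 .",
            "drwxr-xr-x 2 someuser users 4096 Jan 01 2008 ..",
            "drwxr-xr-x 2 someuser users 4096 Jan 01 2008 themes",
            "drwxr-xr-x 2 someuser users 4096 Jan 01 2008 old themes",
            "-rw-r--r-- 1 someuser users  120 Jan 01 2008 index.php"
        };
        string[] names = { ".", "..", "themes", "old themes", "index.php" };

        foreach (string dir in FindDirectories(details, names))
        {
            Console.WriteLine(dir);  // themes, then old themes
        }
    }
}
```

The heuristic can still misfire when a name happens to look like the columns that precede it (a directory literally called "2008 themes", say), so treat it as a starting point rather than a complete parser.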
I was trying to parse the data returned by ListDirectoryDetails and in case of directories I was getting
Using ListDirectoryDetails should yield a line starting with “-” or “d”. I have not seen a scenario where neither character is present. “d” == directory, “-” == file.
February 2nd, 2008 at 4:00 pm
ListDirectoryDetails returns this for me:
04-30-08 03:04PM 40960 File Space Mine.exe
June 6th, 2008 at 2:52 am
Thanks a bunch. This helped get me what I needed for a directory list and a file list.
October 22nd, 2008 at 9:12 pm