Once the data feed file is saved on your desktop computer, you will need to determine if it is in a compressed format. Many feeds are uncompressed, and come in a file with a .TXT or CSV extension.
With compressed feeds, the file extension is .ZIP, or .GZIP. GZIP files can be decompressed with a program like WinZip or natively in Windows 7 or 8 (this capability is built-in with Windows Explorer). Once you have established the file is no longer compressed, it is ready further formatting.
NOTE: "comma" delimited or files with a ".csv" extension have a greater rate of errors. Given a choice between types of delimited files, choose pipe or tab delimited formats, using comma delimited as the last choice. The choice of formats may be available, when you initially select a feed from a merchant or network. By the same reasoning, it is a good idea not to save your cleaned up feed in the "comma" delimited format. The comma delimited format can be converted to other delimited formats, with MS Excel or a good text editor such as TextPad, by using the search and replace feature in these applications.
The next step is to open the file in a program like Microsoft Excel. There are similar applications, but Excel is used in our example. When opening any text file (or any file that isn't a bona fide Excel file with a "XLXS" extension), by selecting All Files (*.*) in the Excel dialog box. Once the data feed file is selected, a new dialog box will appear, with the Text Import Wizard asking how Excel wants to handle the file. Select the radio button marked Delimited, then click Next.
You must know how your file is delimited. This information should be known going forward, and can check the appropriate box in the Text Import Wizard, for example “Tab” (Excel's default delimiter). If the data is Pipe delimited, all one has to do is type in the Pipe character in the “Other” box, or select it. Once the selection is completed, there is an immediate change in the way the data looks in the Data Preview window. If the correct delimiter has been selected, the data will appear organized with neat vertical lines outlining the columns.
Usually, the text qualifier can be left at the default – double quotes. Leave the Treat consecutive delimiters as one checkbox un-checked. At this point the file and data are ready for the next step. There is no reason to click Next, simply click Finish. The data feed should now load into Excel.
Unfortunately, many merchants don’t provide perfect data. Many feeds contain illegal characters, or other things that may cause WebMerge to either stop or perform erratically. There are a couple ways to correct this situation. One can clean them up in a program like Excel, a good text editor or work with them in WebMerge.
One issue to be aware of is the creation of file names or directories based on information in the data feed. Interestingly, when building file names based on categories in the data feed, and there are forward slashes / in category name fields, WebMerge will likely read the forward slash as an instruction to create a directory. WebMerge acts this way, because it thinks there might be a Path there (as in path to a directory or URL). This is actually an advantage if used correctly, as any directory structure required, can be created with forward-slash character in the data. However, the slashes will need to be removed if they are not wanted in your directory structure.
The solution is to perform a Search and Replace. It is advantageous to get very familiar with this functionality in Excel in order to create a really effective data feed file. It also takes a little imagination as to the consequences of replacing characters in the data. For example, when replacing the slashes in the Main_Category field names, another character needs to be substituted or the forward-slashes removed completely.
If the slashes are present in the data to separate a group of numbers, then replacing the slashes with a dash “-“ can be used. The replacement character does not have to be a dash, it can be anything that will work with your scheme. For example, a space or other character can be inserted. See the example below where the existing slash “/” is replaced by a dash.
Original Field (Before Search and Replace):
Cleaner 1/5/10 gallon sizes
New Field (After Search and Replace):
Cleaner 1-5-10 gallon sizes
*WebMerge will make an index filename that looks like this:
In order to successfully complete the search and replace operation, it will be necessary to highlight the entire column by clicking on the appropriate column letter at the top of your spreadsheet, and have the search a replace function fix the entire column at one time. In Excel 2003 and later, the Find & Select function can be found on the Home tab on the ribbon, or by pressing CTRL+H on your keyboard.
* The current version of WebMerge allows the user to make filenames with either underscores “_” or dashes “-“ in place of spaces in the proposed filename. Underscores were used in this example.
TIP:What if you made a mistake? It is always a good idea to check the results right away, if there was a mistake, pressing CTRL+Z will undo the entire Search and Replace operation you just preformed. The CTRL+Z undo will work for all major operations in excel, and can undo the entire operation (not simply one cell at a time).
TIP:Be careful when selecting data. All of a column must be selected in order to avoid mixing fields between records.
Text to Columns:
If the merchant that created the data feed has a field name (Column header in Excel) called Category, but the display reveals there is actually more than one category (or sub-categories) in each column separated by slashes, then it may be a good idea to separate the categories, and make new Field names (columns) based on these new splits. This is an example of category data prior to splitting:
Original Field called "Category" (Before Text to Column Operation):
Cleaning & Sanitation/Cleaning Chemicals/Lubricants/Wd40
One may ask why the merchant who created the data did not make different field names (column headers) for each category and sub-category and separate them? Regretfully, expanding the number of sub-categories is usually the job of the affiliate. Some feeds have different categories already separated.
New columns need to be created for Field names used. If this is not accomplished during the Text to Column Operation, Excel will ask an important question. “Do you want to replace the contents of the destination cells?” It is important to click Cancel. If you do not click Cancel, Excel replaces the contents of these columns and the information in adjoining columns is deleted! What Excel is trying to indicate, is that there is informationin the columns to the right or left of the column you are trying to break apart (split). Excel is really saying that the spreadsheet needs more room.
TIP:Not all versions of Excel may exhibit this behavior.
Click the letter of the column adjacent and to the right of the Category column that will be separated. Use the Insert menu item, and click on Columns. Excel makes a new empty column just to the right of your original Category column. This step should be repeated several times, depending on how many sub-categories exist in the original Category column in the original data feed file. We have rarely encountered more than six, but there may be feeds that have more sub-categories.
Once the new columns have been created, highlight the Category column. Next, go to the Data tab then to the Data Tools area on the Excel ribbon, and click on Text to columns… The “Convert Text to Columns Wizard” is displayed, asking if the information is Delimited. Click this radio button and then click Next. The arrangement of data should all look very familiar, with check boxes for different delimiters (Similar to the appearance when initially opening the file in Excel). The delimiter may already be checked, in case it is not, type the character in, or choose one of the defaults that match the type used normally.
In our example, there are slashes – type or select. The information should appear immediately, separated for you in the Data preview window. Click Finish to make the conversion. Important, if the following is displayed, “Do you want to replace the contents of the destination cells?” then there were not enough blank columns created to the right of the existing column for Excel to put the separated data in. Again, to avoid overwriting the data, click Cancel. More columns need to be created at this time.
Finally, several new columns populated with sub-category information should be displayed. Field names need to be assigned to the columns, so that WebMerge will have some way to recognize the data properly. Some examples of sub-category names to be typed-in for the new columns include sub-category1 and sub-category2 or logical progressions. These names can be anything you choose; however it is always a good idea to be consistent.
Before Text to Columns Operation as displayed in Excel:
After a successful Text to Columns Operation (with field names (column headers) added)):
Fig. 2 - Category listing after separating data into multiple columns
Before finishing cleaning up the feed, determine if all of the fields created are actually needed. If some are not required, think about removing them by clicking on the letter for the column you intend to remove, and click on Delete in the Edit or right-click context menu. This action will result in a smaller file size for the data feed, and may speed up the processing and creation of files.
After completing the templates, compare the fields you have chosen to use in the data, and the WM-Tags written into the templates. If there are other fields in the feed that are not used, it will be more efficient to delete them.
TIP: You may also want to sort your data in order to view it in a more organized fashion. For example, the data can sorted by the category intend for use in building index pages. When sorting in Excel, be sure to highlight everything except the Field names (column headers) at the top. Mixing data among different records is disastrous, causing one to start from the beginning.
Revision 4 (2014)
NEXT SECTION - Preparing your Templates