# Overview

  • Recurring file feeds are files dropped to a Simon-owned S3/SFTP bucket on a frequent cadence. These files can be incrementally loaded or overwritten.

  • If you have a file feed that you’d like to load using the Recurring File Feeds product, please ensure that it meets the requirements outlined here: [File ingestion via S3 or SFTP](🔗)

## Set up a new Recurring File Feed

  1. From the left navigation, expand **Datasets** then click **Datasets**.

  2. Click **Create Dataset**.

  3. Choose **File Feed** then click **Next**.

  4. The **Create a Dataset** screen appears. Complete the following:

Names must be in SNAKE_CASE

If you don't use SNAKE_CASE when naming your dataset, an error is displayed.
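
As a rough illustration, a dataset name could be checked before it is typed into the form. This is only a sketch: the exact naming rule Simon enforces is not spelled out here, so the pattern below (lowercase letters and digits separated by single underscores) is an assumption, and `is_snake_case` and `to_snake_case` are hypothetical helper names.

```python
import re

# Assumed rule: lowercase words made of letters/digits, joined by single underscores.
_SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def is_snake_case(name: str) -> bool:
    """Return True if `name` matches the assumed snake_case pattern."""
    return _SNAKE_CASE.fullmatch(name) is not None


def to_snake_case(name: str) -> str:
    """Replace runs of non-alphanumeric characters with "_" and lowercase the result.

    Note: this does not split camelCase words.
    """
    collapsed = re.sub(r"[^0-9A-Za-z]+", "_", name).strip("_")
    return collapsed.lower()
```

For example, `to_snake_case("Daily Orders")` returns `daily_orders`, which passes `is_snake_case`.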

  5. Click **Start**. The Editor tab opens. There are two primary components:

    • The Directory Structure (on the left): the directory of the file source you selected when creating the dataset (on the previous page). You can navigate down to the file name.

    • File Configuration (on the right): the fields required to validate the File Feed. See below for descriptions.


All fields are required in order to proceed to the next step:

  • **File Path** - what you place in this field depends on whether your files will be [incremental or overwrite](🔗).

    • If incremental, enter the file path down to the last folder, e.g. `file_path/example`

    • If overwrite, enter the file path down to the file name, e.g. `file_path/example/feed1.csv`

  • **Replication Method** - select whether your dataset is incremental or overwrite

Common validation errors and how to fix them

If you do not have the correct file path and correct replication method, an error will display. See [below](🔗) for details.

  • **File Format** - select whether your files are .csv, .tsv, or .json

  • **Compression** - select whether your files are zipped

  • **Encryption** - select whether your files are PGP or GPG encrypted

Zip then Encrypt!

If you encrypt and then zip your files, Simon will not be able to process them. Please make sure you are zipping your files and then encrypting via PGP or GPG.
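
The required order (compress first, encrypt second) can be sketched as follows. This is a minimal illustration, not Simon's tooling: it assumes gzip as the compression format and the `gpg` command-line tool for encryption, and the function name and recipient value are hypothetical. The function compresses the file and only builds the `gpg` command, so nothing is encrypted until you run that command yourself.

```python
import gzip
import shutil
from pathlib import Path
from typing import List


def compress_then_build_encrypt_command(src: Path, recipient: str) -> List[str]:
    """Gzip `src`, then return the gpg command that encrypts the gzipped file."""
    # Step 1: compress the raw file (assumption: gzip is an accepted format).
    compressed = src.with_name(src.name + ".gz")
    with src.open("rb") as raw, gzip.open(compressed, "wb") as gz:
        shutil.copyfileobj(raw, gz)

    # Step 2: encrypt the *compressed* file, never the other way around.
    encrypted = compressed.with_name(compressed.name + ".gpg")
    return [
        "gpg",
        "--output", str(encrypted),
        "--encrypt",
        "--recipient", recipient,  # placeholder key ID or email
        str(compressed),
    ]
```

If `gpg` is installed and the recipient's public key has been imported, the returned list can be passed to `subprocess.run(cmd, check=True)`.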

  6. Click **Validate**. A sample of your file appears.

  7. Mark the **data types** for each column in the Fields tab.

  8. In the Settings tab, there are four configuration options to choose from:

| Configuration Option | Description |
| --- | --- |
| Record ID | Required for incremental feeds. An identifier that is guaranteed to be unique per record. Any duplicates or null IDs are filtered out. |
| Updated Timestamp | Required for incremental feeds. When the record was created. Must be in [epoch time](🔗). |
| Skippable | See [Common terms](🔗). |
| Cadence | When Simon can expect to receive the file drop. We also use this time to notify you if a feed is missing. See also [Common terms](🔗). |
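
If your source system stores timestamps as ISO 8601 strings, they need to be converted before the file is dropped. The sketch below shows one way to do that in Python. It assumes epoch time in whole seconds (whether seconds or milliseconds are expected is not stated here) and treats timestamps without a timezone as UTC; `to_epoch_seconds` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def to_epoch_seconds(iso_timestamp: str) -> int:
    """Convert an ISO 8601 timestamp string to whole seconds since the Unix epoch."""
    parsed = datetime.fromisoformat(iso_timestamp)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


print(to_epoch_seconds("2024-01-01T00:00:00+00:00"))  # 1704067200
```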

## Common terms

| Term | Definition |
| --- | --- |
| Cadence | The time at which Simon can expect a file drop, so we can send a notification to the client if it is missing or on time. |
| Incremental | Only contains net-new data. The data in the new files will be appended to the existing data in the dataset. |
| Overwrite | Contains the data you want in the downstream dataset (Snowflake). The file is replaced with the new file on each drop, and the dataset will only have data from the latest overwrite file. |
| Skippable | A skippable feed does not impact your customer pipe. When the toggle is off, Simon considers this a critical feed that holds up the customer pipe refresh. When the toggle is on, the customer pipe continues as scheduled. |
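
The difference between the two replication methods can be pictured with a small in-memory model. This is purely illustrative and says nothing about how Simon loads data internally; `apply_file_drop` and the row shape are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, str]


def apply_file_drop(existing: List[Row], new_rows: List[Row], method: str) -> List[Row]:
    """Return the dataset contents after one file drop."""
    if method == "incremental":
        # Net-new rows are appended to what is already in the dataset.
        return existing + new_rows
    if method == "overwrite":
        # Only the latest file's rows remain.
        return list(new_rows)
    raise ValueError(f"unknown replication method: {method!r}")
```

With one existing row and a drop containing one new row, `"incremental"` yields two rows, while `"overwrite"` yields only the new row.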

## Common Validation Errors

  • _A file feed dataset already exists with this file path. Multiple file feed datasets cannot ingest files from the same file path_.

    You can't have multiple file feed datasets that run on the same file path. Choose a new file path.

  • _Provided file is either a directory or a file without extension. Overwrite feeds must have a path that leads to a file with an extension._

    The file path drills down to a folder, not a file name with extension. This should either be an incremental feed or you must adjust the file path to drill down to a file name with extension.
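
The second error can often be caught before clicking **Validate** by comparing the file path with the chosen replication method. The check below is a heuristic sketch, not Simon's validation logic: it treats any path whose last segment has an extension as a file, so a folder name containing a dot would be misclassified. `check_file_path` is a hypothetical helper, and the message for incremental feeds is invented for illustration.

```python
from pathlib import PurePosixPath
from typing import Optional


def check_file_path(file_path: str, replication_method: str) -> Optional[str]:
    """Return a problem description, or None if the path fits the replication method."""
    has_extension = PurePosixPath(file_path).suffix != ""
    if replication_method == "overwrite" and not has_extension:
        return "Overwrite feeds must have a path that leads to a file with an extension."
    if replication_method == "incremental" and has_extension:
        # Hypothetical message: incremental paths stop at the last folder.
        return "Incremental feeds should point to a folder, not a single file."
    return None
```

For instance, `check_file_path("file_path/example", "overwrite")` returns the overwrite message, while `check_file_path("file_path/example/feed1.csv", "overwrite")` returns `None`.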






# Give Access to a Third-Party Vendor

Your account manager can help you provide your vendor(s) access to a part of your org’s bucket. We need:

  • The vendor name (we'll use this as their `username`).

  • SSH public key

    • Ask your vendor to generate an SSH key pair (by running `ssh-keygen -P "" -f key_name` in the terminal) and provide you with the public key (`key_name.pub` if they ran the command above).

Public Keys Must Be in the OpenSSH RSA format

We only accept public keys in the OpenSSH RSA format (the key should start with `ssh-rsa`), because that is the only format that Amazon Web Services (AWS) accepts. If your RSA key is in a different format (e.g. SSH2), please [convert it to OpenSSH](🔗).
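
Before forwarding a vendor's key, you may want to confirm it looks like an OpenSSH RSA public key. The sketch below checks that the line starts with `ssh-rsa` and that the base64 payload decodes to a blob whose embedded key type is also `ssh-rsa`. It is a sanity check only and does not verify that the key is otherwise valid; `looks_like_openssh_rsa` is a hypothetical helper name.

```python
import base64
import binascii


def looks_like_openssh_rsa(public_key_line: str) -> bool:
    """Return True if the line has the shape `ssh-rsa <base64> [comment]`."""
    parts = public_key_line.strip().split()
    if len(parts) < 2 or parts[0] != "ssh-rsa":
        return False
    try:
        blob = base64.b64decode(parts[1], validate=True)
    except (binascii.Error, ValueError):
        return False
    # The decoded blob begins with a 4-byte length (7) followed by the key type.
    return blob.startswith(b"\x00\x00\x00\x07ssh-rsa")
```

A key in SSH2 format (which begins with `---- BEGIN SSH2 PUBLIC KEY ----`) or of another key type fails this check. Some newer OpenSSH releases may generate a non-RSA key by default, in which case the vendor can request RSA explicitly with `ssh-keygen -t rsa`.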