Data Ingestion
As companies grow, so does their data infrastructure. File storage needs become increasingly complex and difficult to maintain. Eventually, a company decides that it needs central file repositories and strict folder naming conventions. Without an appropriate software solution, organizing all of the data in a central location is a huge challenge. TACTIC simplifies the data ingestion process by making it as easy as dragging and dropping files into an ingestion tool. From there, TACTIC automatically places the assets where they need to go by enforcing the company’s predefined file structure. Rules can be applied to the data to notify key individuals when files are ingested.
What is data ingestion?
Data ingestion is the process by which an existing file system is intelligently “ingested”, or brought into TACTIC. During the ingestion process, keywords are extracted from the file paths based on rules established for the project. In addition, metadata or other defining information about the file or folder can be applied on ingest. This lets users uniquely identify each ingested file or folder and, by extension, find it more easily later.
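As an illustration of path-based keyword extraction, here is a minimal Python sketch. It is not TACTIC's actual algorithm or API: the function name `extract_keywords`, the `STOP_WORDS` rule, and the choice to split on non-alphanumeric characters are all assumptions made for the example.

```python
import re
from pathlib import PurePosixPath

# Hypothetical project rule: tokens that should never become keywords.
STOP_WORDS = {"the", "and", "of", "final"}

def extract_keywords(path: str, stop_words=STOP_WORDS) -> list[str]:
    """Split a file path into unique lowercase keywords, ignoring the extension."""
    p = PurePosixPath(path)
    # Use every folder name plus the file name without its extension.
    parts = list(p.parts[:-1]) + [p.stem]
    keywords = []
    for part in parts:
        # Split on any run of characters that is not a letter or digit.
        for token in re.split(r"[^A-Za-z0-9]+", part):
            token = token.lower()
            if token and token not in stop_words and token not in keywords:
                keywords.append(token)
    return keywords
```

Under these assumed rules, `extract_keywords("projects/Acme_Campaign/shots/hero-shot_v002.png")` returns `["projects", "acme", "campaign", "shots", "hero", "shot", "v002"]`. A real rule set would likely be configurable per project.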
Why ingest a file system?
Ingesting files and folders into TACTIC gives a user the ability to restructure and organize the entire file system according to a hierarchy defined in TACTIC. With an improved structure, searching for and finding files and folders becomes easier. When ingested, all files and folders are recorded as entries in tables in TACTIC. Each record corresponds to a specific file or folder in the file system. Because metadata and keywords are defined on each of those entries during ingestion, a simple keyword search can locate files and folders faster and more efficiently than an explorer search or a manual hunt through the old file structure.
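To show why keyword records make lookups fast, the sketch below builds a small in-memory inverted index. It is only an illustration under stated assumptions: TACTIC stores its entries in database tables, and the names `build_index` and `search`, as well as the record layout, are hypothetical.

```python
from collections import defaultdict

def build_index(records):
    """Map each keyword to the set of paths tagged with it.

    `records` is an assumed layout: {path: [keyword, ...]}.
    """
    index = defaultdict(set)
    for path, keywords in records.items():
        for keyword in keywords:
            index[keyword.lower()].add(path)
    return index

def search(index, *terms):
    """Return the sorted paths that carry every search term."""
    if not terms:
        return []
    matches = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*matches))
```

Each search touches only the entries for the requested keywords instead of walking the whole folder tree, which is the advantage the paragraph above describes.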
For users who prefer to browse files in a folder-structured format, TACTIC can also display them in a folder view that shows the hierarchy of the newly structured file system built during data ingestion. Even in this case, the reorganization makes finding files and saving new ones easier. This is an attractive solution because the entire file system is stored and managed through a single system: TACTIC.
The Ingest Manager
The Ingest Manager is a proprietary tool that provides the ability to perform large data ingestions of files and folders into TACTIC, all from a single interface. During large data ingestion, keywords are intelligently extracted from the folder structure and metadata is applied to the files and folders according to specific rules set in the interface.
To select which files and folders to ingest, the user configures the Ingest Manager to point to the source directory. The entire folder structure in that directory is then replicated visually in the interface, and the user can traverse all of its subfolders. Selecting a subfolder dynamically generates statistics for it: the number of files and folders it contains, its size, and a list of keywords and file extensions.
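The per-folder statistics described above can be approximated with the Python standard library. This is a rough sketch rather than the Ingest Manager's implementation; the function name `folder_stats` and the shape of the returned dictionary are assumptions.

```python
import os
from collections import Counter

def folder_stats(root: str) -> dict:
    """Count files and subfolders under `root`, total their size, and tally extensions."""
    file_count = 0
    folder_count = 0
    total_bytes = 0
    extensions = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        folder_count += len(dirnames)
        for name in filenames:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
            # Normalize case so ".PNG" and ".png" are counted together.
            ext = os.path.splitext(name)[1].lower()
            if ext:
                extensions[ext] += 1
    return {
        "files": file_count,
        "folders": folder_count,
        "bytes": total_bytes,
        "extensions": dict(extensions),
    }
```

Calling `folder_stats` on a selected subfolder yields the counts, total size, and extension list that a user would want to see before deciding what to ingest.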
The Data Ingestion Process
Selecting a folder or file from the folder hierarchy displayed in the Ingest Manager interface begins the data ingestion process. Statistics and various ingest options are displayed for the selection. The options control what happens to the folder and its contents on ingestion into TACTIC. They include selecting:
- Where to ingest the files and folders to
- How to ingest the files and folders
- What folders or contents of the folders are to be ingested
- What metadata should be applied to the files and folders on data ingestion
Automated features of the Ingest Manager, such as keyword and file extension generation, limit the need for extensive user interaction and reduce the possibility of user error, while still letting the user choose which keywords to apply to the ingested files and folders. The user can also select which files to ingest based on file extension. Both features use an algorithm that extracts keywords and file extensions from the folder structure, so that a complete list of keywords and file extensions is presented to the user.
Testing and queue features give the user full control of the data ingestion process. The user can see how the folders and files will be ingested before submission and can monitor the progress of each ingestion after submission. After entering all pertinent data and selecting data ingestion options, clicking the “Test” button opens a popup that shows how the folder and file entries will look after ingestion. This gives the user the opportunity to correct any mistakes or modify any entered data.
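The idea behind a test run is to compute the resulting entries without changing anything. The sketch below shows one way to model that in Python. The class `IngestOptions`, the function `preview_ingest`, and all field names are hypothetical; they loosely mirror the options listed earlier (destination, extension filter, metadata) and are not TACTIC's API.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath

@dataclass
class IngestOptions:
    destination: str                                # where files would be ingested to
    extensions: set = field(default_factory=set)    # empty set means accept every extension
    metadata: dict = field(default_factory=dict)    # metadata applied to every entry

def preview_ingest(paths, options):
    """Return the entries an ingest would create, without touching any files."""
    entries = []
    for path in paths:
        p = PurePosixPath(path)
        # Skip files whose extension was not selected for ingestion.
        if options.extensions and p.suffix.lower() not in options.extensions:
            continue
        entry = {
            "source": path,
            "target": str(PurePosixPath(options.destination) / p.name),
        }
        entry.update(options.metadata)
        entries.append(entry)
    return entries
```

Because the function only builds and returns a list, the user can inspect the result, adjust the options, and run it again as often as needed before submitting a real job.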
Upon submission, the ingestion of the folder or file is recorded as an ingestion job in a queue. The queue tracks each ingestion job submitted and processes each job in order of submission. Statuses are assigned to each job to indicate whether the job is still pending, in progress or complete. If a job fails, an error status is assigned and the user can assess the issue based on the error message that gets logged in the queue. Once the job is complete, the ingested folders and files are available to and searchable by the user in TACTIC.
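The queue behavior described above (jobs processed in order of submission, with a status and an error message per job) can be sketched as a simple first-in, first-out queue. The names `Status`, `IngestJob`, and `IngestQueue`, and the `handler` callback that stands in for the actual ingest work, are assumptions made for this example.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in progress"
    COMPLETE = "complete"
    ERROR = "error"

@dataclass
class IngestJob:
    path: str
    status: Status = Status.PENDING
    error: str = ""

class IngestQueue:
    def __init__(self, handler):
        self._handler = handler   # callable that performs the ingest for one path
        self._pending = deque()   # jobs waiting to run, oldest first
        self.jobs = []            # every job ever submitted, for monitoring

    def submit(self, path):
        job = IngestJob(path)
        self._pending.append(job)
        self.jobs.append(job)
        return job

    def run(self):
        """Process pending jobs in order of submission."""
        while self._pending:
            job = self._pending.popleft()
            job.status = Status.IN_PROGRESS
            try:
                self._handler(job.path)
                job.status = Status.COMPLETE
            except Exception as exc:
                # Record the failure so the user can assess it later.
                job.status = Status.ERROR
                job.error = str(exc)
```

A failed job does not stop the queue in this sketch; the remaining jobs still run, and the error message stays attached to the job that raised it.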
Southpaw’s TACTIC is a well-established Digital Asset Management (DAM) platform. Using TACTIC as the backbone, Southpaw has developed tools to manage and synchronize file systems on multiple servers worldwide, creating a scalable file system management solution. Southpaw’s “Ingest Manager” and “aSync” technologies are tools that have the stability and versatility to ingest files and folders of any number, size and type into the TACTIC system and then “aSynchronize” these files and folders on different servers and geographical locations.