File Reader Activity

Reads the target file as a view. Well suited to loading files from HDFS, FTP, and local file systems.

Parameters

| Name | Description | Type | Default |
| --- | --- | --- | --- |
| Title | Title of the activity. It is displayed in the designer. | text | |
| Path | File path. See below for supported file sources. | text | |
| Format | File format. See below for supported file formats. | enum | |
| Schema | Optional schema definition in Avro schema format. | text | |
| As | Name of the DataRow view linked to the file. | text | |
| Options | Additional options. For example, the delimiter for a tab-delimited CSV file can be specified as [delimiter,\t]. | key-value | |
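
As an illustration of how the parameters listed above fit together, the following Python sketch assembles one possible set of values. The dictionary layout, the sample path, the `people` view name, and the record and field names in the schema are all assumptions made for this example; only the parameter names and the tab delimiter option come from the parameter descriptions. The schema is written as a standard Avro record schema and serialized to JSON text, since the Schema parameter is of type text.

```python
import json

# Hypothetical Avro record schema. The record name and field names are
# invented for illustration and do not come from the documentation.
avro_schema = {
    "type": "record",
    "name": "Person",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
        # A nullable field is expressed in Avro as a union with "null".
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
}

# Illustrative values keyed by the documented parameter names.
# The dictionary form is an assumption, not the activity's real
# configuration syntax.
file_reader_params = {
    "Title": "Load people",
    "Path": "file:///data/people.tsv",
    "Format": "CSV",
    "Schema": json.dumps(avro_schema),
    "As": "people",
    # Key-value form of the documented [delimiter,\t] option.
    "Options": {"delimiter": "\t"},
}

if __name__ == "__main__":
    for name, value in file_reader_params.items():
        print(f"{name}: {value!r}")
```

How the designer actually accepts these values is not shown here, so treat the structure as a reading aid rather than a configuration format.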

Supported File Sources

| Name | Path Template |
| --- | --- |
| Local File System | file://<path> |
| HDFS | hdfs://<namenodehost>/<path> |
| FTP | ftp://<username>:<password>@<host>/<path> |
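
The path templates above follow ordinary URL syntax, so a path can be checked before it is entered in the activity. The sketch below uses Python's standard `urllib.parse.urlsplit` for this; the `describe_path` helper, the `SOURCES` mapping, and the sample host names and credentials are hypothetical and are not part of the product.

```python
from urllib.parse import urlsplit

# URL schemes taken from the supported file sources listed above.
SOURCES = {"file": "Local File System", "hdfs": "HDFS", "ftp": "FTP"}


def describe_path(path):
    """Split a path into its source name, host, user name, and file path.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(path)
    if parts.scheme not in SOURCES:
        raise ValueError(f"unsupported file source: {parts.scheme!r}")
    return {
        "source": SOURCES[parts.scheme],
        "host": parts.hostname,      # None when no host is given
        "username": parts.username,  # None unless credentials are embedded
        "path": parts.path,
    }


if __name__ == "__main__":
    examples = [
        "file:///data/people.csv",
        "hdfs://namenode.example.com/warehouse/events.parquet",
        "ftp://alice:secret@ftp.example.com/incoming/data.csv",
    ]
    for example in examples:
        print(describe_path(example))
```

Under standard URL parsing, an absolute local path such as /data/people.csv produces three slashes after `file:`, because the host part is empty. Whether the activity also accepts other spellings is not covered here.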

Supported File Formats

| Name | Description |
| --- | --- |
| Parquet | Apache Parquet is a columnar storage format. |
| ORC | ORC is also a columnar storage format. |
| Avro | Apache Avro is a data serialization format. |
| CSV | CSV is a delimited text file format. |
| JSON | JSON is a lightweight data-interchange format. |
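
To make the two text formats concrete, the sketch below parses a small tab-delimited sample and a small JSON sample with Python's standard library. It only illustrates the formats themselves and is not the activity's own reader; the sample data is invented, and the one-object-per-line JSON layout is an assumption, since the expected JSON layout is not stated here.

```python
import csv
import io
import json

# Tab-delimited text: the delimiter must be given explicitly, which is
# what a delimiter option expresses for a CSV source.
tab_delimited = "id\tname\n1\tAda\n2\tGrace\n"
csv_rows = list(csv.DictReader(io.StringIO(tab_delimited), delimiter="\t"))

# JSON text, assumed here to hold one object per line.
json_lines = '{"id": 1, "name": "Ada"}\n{"id": 2, "name": "Grace"}\n'
json_rows = [json.loads(line) for line in json_lines.splitlines() if line.strip()]

if __name__ == "__main__":
    print(csv_rows)   # every CSV value is read as a string
    print(json_rows)  # JSON keeps numbers as numbers
```

The difference in value types is one reason a schema can be useful for delimited text, where the file itself carries no type information.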