File Reader Activity
Reads the target file as a view. Well suited to loading files from HDFS, FTP, and local file systems.
Parameters
| Name | Description | Type | Default |
|---|---|---|---|
| Title | Title of the activity. It is displayed in the designer. | text | |
| Path | File path. See below for supported file sources. | text | |
| Format | File format. See below for supported file formats. | enum | |
| Schema | Optional schema definition in Avro schema format. | text | |
| As | Name of the DataRow view linked to the file. | text | |
| Options | Additional options. For example, the delimiter for a tab-delimited CSV file can be specified as `[delimiter,\t]`. | key-value | |
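
As a rough illustration of how these parameters fit together, the sketch below assembles one set of values for a tab-delimited file. The dictionary layout, the `Customer` record, its field names, and the file path are all hypothetical; only the parameter names and the Avro schema structure come from the table above and the Avro specification. How the designer actually stores these values is not specified here.

```python
import json

# Avro record schema for a hypothetical three-column file.
# A nullable field is a union with "null" listed first, so its default can be null.
schema = {
    "type": "record",
    "name": "Customer",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
        {"name": "signup_date", "type": ["null", "string"], "default": None},
    ],
}

# Hypothetical parameter set for one File Reader activity.
# The keys mirror the parameter names in the table; the dict itself is
# only an illustration, not the tool's real configuration format.
activity = {
    "Title": "Read customers",
    "Path": "file:///data/customers.tsv",
    "Format": "CSV",
    "Schema": json.dumps(schema),      # Schema is a text parameter, so serialize it
    "As": "customers",
    "Options": {"delimiter": "\t"},    # corresponds to [delimiter,\t]
}

print(json.dumps(activity, indent=2))
```

The schema is serialized to a string because the Schema parameter is of type text; the Options entry carries the same key and value as the `[delimiter,\t]` example.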
Supported File Sources
| Name | Path Template |
|---|---|
| Local File System | `file://<path>` |
| HDFS | `hdfs://<namenodehost>/<path>` |
| FTP | `ftp://<username>:<password>@<host>/<path>` |
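
The path templates above can be filled in programmatically. The following is a minimal sketch; `build_path` and its argument names are hypothetical helpers, not part of the product. Percent-encoding the FTP credentials follows general URL syntax and is an assumption here, since this page does not say how special characters in user names or passwords are handled.

```python
from urllib.parse import quote


def build_path(source, path, host=None, username=None, password=None):
    """Fill in one of the documented path templates (hypothetical helper)."""
    if source == "local":
        # file://<path>; an absolute path yields three slashes in total
        return "file://" + path

    # The remaining templates already contain the slash before <path>.
    relative = path.lstrip("/")

    if source == "hdfs":
        if not host:
            raise ValueError("hdfs paths need a namenode host")
        return f"hdfs://{host}/{relative}"

    if source == "ftp":
        if not (host and username and password):
            raise ValueError("ftp paths need host, username and password")
        # Assumption: credentials are percent-encoded so that characters
        # such as '@' or ':' do not break the URL structure.
        user = quote(username, safe="")
        secret = quote(password, safe="")
        return f"ftp://{user}:{secret}@{host}/{relative}"

    raise ValueError(f"unsupported source: {source}")


print(build_path("local", "/data/in.csv"))
print(build_path("hdfs", "/warehouse/events.parquet", host="namenode01"))
```

The host name `namenode01` and the file paths are placeholders; substitute the values for your own environment.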
Supported File Formats
| Name | Description |
|---|---|
| Parquet | Apache Parquet is a columnar storage format. |
| ORC | ORC is also a columnar storage format. |
| Avro | Apache Avro is a data serialization format. |
| CSV | CSV is a delimited text file format. |
| JSON | JSON is a lightweight data-interchange format. |
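
When paths are generated by a script, the Format value can often be derived from the file extension. The sketch below shows one way to do that; `guess_format`, the extension mapping, and the exact spelling of the format names are assumptions made for illustration. The activity itself takes the format as an explicit enum parameter, and nothing on this page says it infers the format on its own.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to the format names listed above.
FORMAT_BY_EXTENSION = {
    ".parquet": "Parquet",
    ".orc": "ORC",
    ".avro": "Avro",
    ".csv": "CSV",
    ".tsv": "CSV",   # tab-delimited files are read as CSV with a delimiter option
    ".json": "JSON",
}


def guess_format(path):
    """Return a format name for a path, based only on its extension."""
    suffix = PurePosixPath(path).suffix.lower()
    try:
        return FORMAT_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"cannot infer format from {path!r}") from None


print(guess_format("hdfs://namenode01/warehouse/events.parquet"))
```

A `.tsv` file maps to CSV here, which means the delimiter still has to be supplied through the Options parameter.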