File Utilities

file_utils

Utilities for downloading and processing files.

cotk.file_utils.get_resource_file_path(file_id, cache_dir=None, config_dir=None)

Get the file path of a resource of any type.

cotk.file_utils.import_local_resources(file_id, local_path, cache_dir=None, config_dir=None, ignore_exist_error=False)

Import a benchmark resource from a local path. If its hash check passes, save it to the cache.
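The hash check described above can be pictured with a minimal sketch. It does not call cotk: the helper names `sha256_of_file` and `import_if_hash_matches`, the choice of SHA-256, and the cache file naming are all assumptions made for illustration, and the library's real behavior may differ.

```python
import hashlib
import os
import shutil


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large resources are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def import_if_hash_matches(local_path, expected_hash, cache_dir,
                           ignore_exist_error=False):
    """Hypothetical helper: copy `local_path` into `cache_dir` only if its hash matches."""
    actual = sha256_of_file(local_path)
    if actual != expected_hash:
        raise ValueError("hash mismatch: expected %s, got %s" % (expected_hash, actual))
    os.makedirs(cache_dir, exist_ok=True)
    # Assumption: the cached copy is named after the expected hash.
    target = os.path.join(cache_dir, expected_hash)
    if os.path.exists(target):
        if ignore_exist_error:
            return target
        raise FileExistsError(target)
    shutil.copyfile(local_path, target)
    return target
```

In this sketch, `ignore_exist_error=True` returns the existing cached path instead of raising when the resource was imported before.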

cotk.file_utils.load_file_from_url(url, force=False, cache_dir=None)

See cotk.downloader.load_file_from_url.

resources_processor

class cotk.file_utils.ResourceProcessor(cache_dir, config_dir)

Base class for resource processors.

class cotk.file_utils.DefaultResourceProcessor(cache_dir, config_dir)

Processor for default resource: do nothing.

preprocess(local_path)

Preprocess after download and before save.

postprocess(local_path)

Postprocess before read.
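The two hooks can be read as stages around the cache: one runs between download and save, the other between save and read. The following is a minimal sketch of that ordering for a do-nothing processor. `SketchProcessor` and `fetch_and_open` are hypothetical names, and the idea that each hook returns a path is an assumption, not something the reference above states.

```python
class SketchProcessor:
    """Hypothetical stand-in for a do-nothing resource processor (not the cotk class)."""

    def __init__(self, cache_dir, config_dir):
        self.cache_dir = cache_dir
        self.config_dir = config_dir

    def preprocess(self, local_path):
        # Runs after download and before the resource is saved.
        # Assumption: the hook returns the path that should be saved.
        return local_path

    def postprocess(self, local_path):
        # Runs on the saved resource before it is read.
        # Assumption: the hook returns the path that should be read.
        return local_path


def fetch_and_open(processor, downloaded_path):
    """Run both hooks in the order their descriptions imply."""
    saved_path = processor.preprocess(downloaded_path)  # download -> save
    return processor.postprocess(saved_path)            # save -> read
```

A dataset-specific processor would override one or both hooks, for example to unpack an archive in `preprocess` or to select a file in `postprocess`.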

class cotk.file_utils.BaseResourceProcessor(cache_dir, config_dir)

Basic processor for MSCOCO, OpenSubtitles, Ubuntu…

basepreprocess(local_path, name)

Preprocess after download and before save.

postprocess(local_path)

Postprocess before read.

get_temp_dir(filepath)

Get a temp directory in which temporary files may be saved. The temp directory is a subdirectory of self.cache_dir and is named after the hash value of the argument filepath, so the same filepath always maps to the same temp directory.
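The naming scheme above can be sketched in a few lines. This is a standalone function rather than the cotk method: passing `cache_dir` as an argument, hashing with SHA-256, and the `_temp` suffix are assumptions chosen for illustration.

```python
import hashlib
import os


def get_temp_dir(cache_dir, filepath):
    """Return (and create) a temp directory under `cache_dir` named after a hash of `filepath`."""
    # Hash the path string itself, not the file contents, so the result
    # depends only on the argument and is stable across calls.
    name = hashlib.sha256(filepath.encode("utf-8")).hexdigest()
    temp_dir = os.path.join(cache_dir, name + "_temp")
    os.makedirs(temp_dir, exist_ok=True)
    return temp_dir
```

Because the directory name is a pure function of `filepath`, repeated calls with the same argument reuse one directory, while different file paths get separate directories.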

class cotk.file_utils.MSCOCOResourceProcessor(cache_dir, config_dir)

Processor for MSCOCO dataset

preprocess(local_path)

Preprocess after download and before save.

postprocess(local_path)

Postprocess before read.

class cotk.file_utils.OpenSubtitlesResourceProcessor(cache_dir, config_dir)

Processor for OpenSubtitles dataset

preprocess(local_path)

Preprocess after download and before save.

postprocess(local_path)

Postprocess before read.

class cotk.file_utils.UbuntuResourceProcessor(cache_dir, config_dir)

Processor for UbuntuCorpus dataset

preprocess(local_path)

Preprocess after download and before save.

postprocess(local_path)

Postprocess before read.

class cotk.file_utils.SwitchboardCorpusResourceProcessor(cache_dir, config_dir)

Processor for SwitchboardCorpus dataset

preprocess(local_path)

Preprocess after download and before save.

postprocess(local_path)

Postprocess before read.

_read_file(filepath, read_multi_ref=False)

Parameters
  • filepath (str) – Name of the file to read from.

  • read_multi_ref (bool) – If False, add the turn <d> ahead of each session. If True, add the turn <d> at the end of each session and read candidate responses.
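The effect of read_multi_ref on the position of the <d> turn can be illustrated with a small sketch. `mark_sessions` is a hypothetical helper working on in-memory lists of turns. It does not read a file and it leaves out the reading of candidate responses, so it only shows where the marker goes.

```python
def mark_sessions(sessions, read_multi_ref=False, marker="<d>"):
    """Add a marker turn to each session; its position depends on `read_multi_ref`."""
    marked = []
    for turns in sessions:
        if read_multi_ref:
            # Marker goes at the end of the session.
            marked.append(list(turns) + [marker])
        else:
            # Marker goes ahead of the session.
            marked.append([marker] + list(turns))
    return marked
```

For a session with the turns "hi" and "hello", the default setting puts <d> before "hi", while `read_multi_ref=True` puts it after "hello".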

class cotk.file_utils.SSTResourceProcessor(cache_dir, config_dir)

Processor for SST dataset

preprocess(local_path)

Preprocess after download and before save.

postprocess(local_path)

Postprocess before read.

class cotk.file_utils.GloveResourceProcessor(cache_dir=None, config_dir=None)

Base class for all dimension versions of the GloVe word vectors.

basepreprocess(local_path, name)

Preprocess after download and before save.

basepostprocess(local_path, name)

Postprocess before read.
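The signatures suggest a delegation pattern: the base class takes an extra name argument, and each dimension-specific subclass supplies it. Below is a minimal sketch of that pattern. The class names, the value "glove50d", and the rule that `name` selects a `.txt` file inside the cached directory are hypothetical and are not taken from the cotk implementation.

```python
import os


class GloveSketchBase:
    """Hypothetical base class shared by all dimension variants."""

    def basepreprocess(self, local_path, name):
        # Assumption: nothing needs to change before saving.
        return local_path

    def basepostprocess(self, local_path, name):
        # Assumption: `name` picks one file inside the cached directory.
        return os.path.join(local_path, name + ".txt")


class Glove50dSketch(GloveSketchBase):
    """Hypothetical 50-dimension variant that only supplies its own name."""

    def preprocess(self, local_path):
        return self.basepreprocess(local_path, "glove50d")

    def postprocess(self, local_path):
        return self.basepostprocess(local_path, "glove50d")
```

With this layout, adding another dimension only requires a subclass that passes a different name to the shared base methods.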

class cotk.file_utils.Glove50dResourceProcessor(cache_dir=None, config_dir=None)

Processor for glove50d word vectors

preprocess(local_path)

Preprocess after download and before save.

postprocess(local_path)

Postprocess before read.

class cotk.file_utils.Glove100dResourceProcessor(cache_dir=None, config_dir=None)

Processor for glove100d word vectors

preprocess(local_path)

Preprocess after download and before save.

postprocess(local_path)

Postprocess before read.

class cotk.file_utils.Glove200dResourceProcessor(cache_dir=None, config_dir=None)

Processor for glove200d word vectors

preprocess(local_path)

Preprocess after download and before save.

postprocess(local_path)

Postprocess before read.

class cotk.file_utils.Glove300dResourceProcessor(cache_dir=None, config_dir=None)

Processor for glove300d word vectors

preprocess(local_path)

Preprocess after download and before save.

postprocess(local_path)

Postprocess before read.