modelzoo.transformers.data_processing.scripts.pubmed.preprocess.TextFormatting.TextFormatting
class modelzoo.transformers.data_processing.scripts.pubmed.preprocess.TextFormatting.TextFormatting
 Bases: object

 Parameters
  pubmed_path (str) – Path to the folder containing PubMed files
  output_folder (str) – Path where the txt file is to be written
  filesize_limit (Optional[int]) – Maximum size of each text file
  recursive (Optional[bool]) – If true, searches for nxml/xml files recursively within subfolders
Methods
 merge
 merge_abstracts
 merge_fulltext

__init__(pubmed_path, output_filename, filesize_limit=5000000000, recursive=False)
 Parameters
  pubmed_path (str) – Path to the folder containing PubMed files
  output_folder (str) – Path where the txt file is to be written
  filesize_limit (Optional[int]) – Maximum size of each text file
  recursive (Optional[bool]) – If true, searches for nxml/xml files recursively within subfolders
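
The effect of the recursive flag can be illustrated with a minimal, standalone sketch. It does not reproduce the TextFormatting implementation: the helper name find_pubmed_files is hypothetical, and using pathlib for the search is an assumption made only for illustration.

```python
from pathlib import Path
from typing import List


def find_pubmed_files(pubmed_path: str, recursive: bool = False) -> List[Path]:
    """Hypothetical helper: collect .nxml/.xml files under pubmed_path.

    This is an illustrative stand-in, not the library's own search code.
    """
    root = Path(pubmed_path)
    # rglob descends into subfolders; glob only inspects the top-level folder.
    walker = root.rglob if recursive else root.glob
    found: List[Path] = []
    for pattern in ("*.nxml", "*.xml"):
        found.extend(p for p in walker(pattern) if p.is_file())
    # Sort so the result does not depend on filesystem ordering.
    return sorted(found)
```

With recursive=False, only files placed directly in the folder are returned; with recursive=True, files in nested subfolders are included as well.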