inspirehep.modules.workflows.utils package
Module contents
Workflows utils.
-
inspirehep.modules.workflows.utils.convert(xml, xslt_filename)
Convert XML using given XSLT stylesheet.
-
inspirehep.modules.workflows.utils.do_not_repeat(step_id)
Decorator used to skip workflow steps when a workflow is re-run.
Will store the result of running the workflow step in source_data.persistent_data after running the first time, and skip the step on the following runs, also applying previously recorded ‘changes’ to extra_data.
The decorated function has to conform to the following signature:
def decorated_step(obj: WorkflowObject, eng: WorkflowEngine) -> Dict[str, Any]: ...
Where obj and eng are the usual arguments following the protocol of all workflow steps. The returned value of decorated_step will be used as a patch to be applied on the workflow object’s source data (which ‘replays’ changes made by the workflow step).
Parameters: step_id (str) – name of the workflow step, to be used as key in persistent_data
Returns: the decorator
Return type: callable
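The run-once-then-replay behaviour described above can be illustrated with a minimal sketch. It is not the package's implementation: FakeWorkflowObject, do_not_repeat_sketch, the classify step and the exact location of persistent_data inside extra_data are all assumptions made for the example.

```python
import functools


class FakeWorkflowObject:
    """Hypothetical stand-in for a workflow object (not the real class)."""

    def __init__(self):
        # Assumed layout: persistent_data lives under extra_data["source_data"].
        self.extra_data = {"source_data": {"persistent_data": {}}}


def do_not_repeat_sketch(step_id):
    """Illustrative decorator: run the step once, replay its result afterwards."""
    def decorator(step):
        @functools.wraps(step)
        def wrapper(obj, eng):
            source_data = obj.extra_data["source_data"]
            persistent = source_data.setdefault("persistent_data", {})
            if step_id not in persistent:
                # First run: execute the step and record the returned changes.
                persistent[step_id] = step(obj, eng)
            changes = persistent[step_id]
            # Every run: apply the recorded changes to extra_data.
            obj.extra_data.update(changes)
            return changes
        return wrapper
    return decorator


calls = []


@do_not_repeat_sketch("classify")
def classify(obj, eng):
    calls.append("ran")
    return {"classification": "core"}


obj = FakeWorkflowObject()
classify(obj, eng=None)  # executes the step body
classify(obj, eng=None)  # skipped; the recorded changes are replayed
```

In this sketch the step body runs only on the first call, while the recorded dictionary is applied to extra_data on both calls.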
-
inspirehep.modules.workflows.utils.download_file_to_workflow(*args, **kwargs)
Download a file to a specified workflow.
The workflow.files property is actually a method, which returns a WorkflowFilesIterator. This class inherits a custom __setitem__ method from its parent, FilesIterator, which ends up calling save on an invenio_files_rest.storage.pyfs.PyFSFileStorage instance through ObjectVersion and FileObject. This method consumes the stream passed to it and saves in its place a FileObject with the details of the downloaded file.
Consuming the stream might raise a ProtocolError because the server might terminate the connection before sending any data. In this case we retry 5 times with exponential backoff before giving up.
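The retry policy mentioned above can be sketched as follows. This is only an illustration: the ProtocolError class here is a local stand-in for the library exception, retry_with_backoff and its parameters are hypothetical names, and the base delay of one second and the reading of "retry 5 times" as five retries after the initial attempt are assumptions.

```python
import time


class ProtocolError(Exception):
    """Local stand-in for the exception named in the text (assumption)."""


def retry_with_backoff(action, retries=5, base_delay=1.0, sleep=time.sleep):
    """Call action(); on ProtocolError wait base_delay * 2**attempt and retry.

    After `retries` failed retries the last ProtocolError is re-raised.
    The `sleep` argument is injectable so the waiting can be replaced in tests.
    """
    for attempt in range(retries + 1):
        try:
            return action()
        except ProtocolError:
            if attempt == retries:
                raise  # give up after the final retry
            sleep(base_delay * 2 ** attempt)


# Usage: an action that fails twice before succeeding.
recorded_delays = []
outcomes = iter([ProtocolError(), ProtocolError(), "file saved"])


def flaky_download():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


result = retry_with_backoff(flaky_download, sleep=recorded_delays.append)
```

Passing recorded_delays.append as the sleep function keeps the example instantaneous while still showing the doubling delays.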
-
inspirehep.modules.workflows.utils.get_document_in_workflow(*args, **kwds)
Context manager giving the path to the document attached to a workflow object.
Parameters: obj – workflow object
Returns: The path to a local copy of the document. If no documents are present, it returns None. If several documents are present, it prioritizes the fulltext. If several documents with the same priority are present, it takes the first one and logs an error.
Return type: Optional[str]
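The selection rules listed under Returns can be sketched in isolation. The function name pick_document and the document shape (dictionaries with "key" and "fulltext" fields) are assumptions for illustration only; the real function works on a workflow object and yields a file path.

```python
import logging

LOGGER = logging.getLogger(__name__)


def pick_document(documents):
    """Return the preferred document dict, or None when there are none.

    Assumed shape: each document is a dict with a "key" and an optional
    boolean "fulltext" flag (hypothetical, for illustration only).
    """
    if not documents:
        return None
    fulltexts = [doc for doc in documents if doc.get("fulltext")]
    # Fulltext documents take priority; otherwise consider all documents.
    candidates = fulltexts or documents
    if len(candidates) > 1:
        LOGGER.error(
            "Several documents with the same priority; using the first one."
        )
    return candidates[0]


chosen = pick_document([
    {"key": "figure.png"},
    {"key": "paper.pdf", "fulltext": True},
])
```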
-
inspirehep.modules.workflows.utils.get_resolve_edit_article_callback_url()
Resolve edit_article workflow letting it continue.
Note: It’s using inspire_workflows.callback_resolve_edit_article route.
-
inspirehep.modules.workflows.utils.get_resolve_merge_conflicts_callback_url()
Resolve merge conflicts callback.
Returns the callback url for resolving the merge conflicts.
Note: It’s using inspire_workflows.callback_resolve_merge_conflicts route.
-
inspirehep.modules.workflows.utils.get_resolve_validation_callback_url()
Resolve validation callback.
Returns the callback url for resolving the validation errors.
Note: It’s using inspire_workflows.callback_resolve_validation route.
-
inspirehep.modules.workflows.utils.get_source_for_root(source)
Source for the root workflow object.
Parameters: source (str) – the record source.
Returns: the source for the root workflow object.
Return type: (str)
Note: For the time being any workflow with acquisition_source.source different from arxiv and submitter will be stored as publisher.
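The rule in the note above amounts to a small mapping, sketched here under assumptions: the name get_source_for_root_sketch is hypothetical, and the comparison is done on the exact string with no case or whitespace normalization, which the real function may handle differently.

```python
def get_source_for_root_sketch(source):
    """Illustrative mapping: keep arxiv and submitter, fold the rest into publisher."""
    # Assumption: exact string comparison, no normalization of the input.
    if source in ("arxiv", "submitter"):
        return source
    return "publisher"


examples = {
    name: get_source_for_root_sketch(name)
    for name in ("arxiv", "submitter", "some-journal")
}
```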
-
inspirehep.modules.workflows.utils.get_validation_errors(data, schema)
Creates a validation_errors dictionary.
Parameters:
Returns: validation_errors formatted dict.
Return type:
-
inspirehep.modules.workflows.utils.ignore_timeout_error(return_value=None)
Ignore the TimeoutError, returning return_value when it happens.
Quick fix for refextract and plotextract tasks only. It shouldn’t be used for others!
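A decorator with this behaviour could look like the following sketch. It is an illustration rather than the package's code: ignore_timeout_error_sketch and extract_references are hypothetical names, and the sketch catches Python's built-in TimeoutError, whereas the real decorator may catch a TimeoutError class from another library.

```python
import functools


def ignore_timeout_error_sketch(return_value=None):
    """Illustrative decorator: swallow TimeoutError and return a fallback value."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except TimeoutError:
                # Assumption: the built-in TimeoutError is the one being raised.
                return return_value
        return wrapper
    return decorator


@ignore_timeout_error_sketch(return_value=[])
def extract_references(timed_out):
    # Hypothetical task body that may time out.
    if timed_out:
        raise TimeoutError("took too long")
    return ["ref-1"]
```

With this wrapper a timed-out call yields the fallback (an empty list here) instead of propagating the exception, which is why the docstring restricts it to tasks whose output is optional.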
-
inspirehep.modules.workflows.utils.insert_wf_record_source(json, record_uuid, source)
Stores a record in the WorkflowRecordSource table in the db.
Parameters:
-
inspirehep.modules.workflows.utils.json_api_request(*args, **kwargs)
Make JSON API request and return JSON response.
-
inspirehep.modules.workflows.utils.log_workflows_action(action, relevance_prediction, object_id, user_id, source, user_action='')
Log the action taken by user compared to a prediction.
-
inspirehep.modules.workflows.utils.read_all_wf_record_sources(record_uuid)
Retrieve all WorkflowRecordSource entries for a given record id.
Parameters: record_uuid (uuid) – the uuid of the record
Returns: the WorkflowRecordSource entries related to record_uuid
Return type: (list)
-
inspirehep.modules.workflows.utils.read_wf_record_source(record_uuid, source)
Retrieve a record from the WorkflowRecordSource table.
Parameters:
Returns: the given record, if any, or None
Return type: (dict)
-
inspirehep.modules.workflows.utils.timeout_with_config(config_key)
Decorator to set a configurable timeout on a function.
Parameters: config_key (str) – config key with an integer value representing the time in seconds after which the decorated function will abort, raising a TimeoutError. If the key is not present in the config, a KeyError is raised.
Note: This function is needed because it’s impossible to pass a value read from the config as an argument to a decorator, as it gets evaluated before the application context is set up.
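The point of the note, deferring the config lookup from decoration time to call time, can be sketched as below. Everything here is an assumption for illustration: CONFIG is a plain dictionary standing in for the application config, timeout_with_config_sketch and fetch_plots are hypothetical names, and the timeout itself is enforced with a worker thread, which is not necessarily how the package does it (the worker is not interrupted, only abandoned).

```python
import concurrent.futures
import functools

CONFIG = {}  # hypothetical stand-in for the application config


def timeout_with_config_sketch(config_key):
    """Illustrative decorator: read the timeout from CONFIG when the function runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Looked up at call time, so the config may be populated after
            # the decorator was applied. A missing key raises KeyError.
            seconds = CONFIG[config_key]
            executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
            future = executor.submit(func, *args, **kwargs)
            try:
                return future.result(timeout=seconds)
            except concurrent.futures.TimeoutError:
                raise TimeoutError(
                    "%s exceeded %s seconds" % (func.__name__, seconds)
                )
            finally:
                # Do not block on the worker; it keeps running in the background.
                executor.shutdown(wait=False)
        return wrapper
    return decorator


@timeout_with_config_sketch("PLOTEXTRACT_TIMEOUT_SKETCH")  # hypothetical key
def fetch_plots():
    return "done"


# The key is set only after decoration, which is exactly the case a plain
# decorator argument could not handle.
CONFIG["PLOTEXTRACT_TIMEOUT_SKETCH"] = 5
outcome = fetch_plots()
```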