inspirehep.modules.workflows package

Subpackages

Submodules

inspirehep.modules.workflows.bundles module

Bundles for forms used across INSPIRE.

inspirehep.modules.workflows.config module

Workflows configuration.

inspirehep.modules.workflows.config.WORKFLOWS_PLOTEXTRACT_TIMEOUT = 300

Time in seconds a plotextract task is allowed to run before it is killed.

inspirehep.modules.workflows.config.WORKFLOWS_REFEXTRACT_TIMEOUT = 600

Time in seconds a refextract task is allowed to run before it is killed.
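
As a rough illustration of what "allowed to run before it is killed" means for these two settings, the sketch below runs an external command under a time limit using only the standard library. The helper name `run_with_timeout` is hypothetical, and this is not necessarily how the workflow tasks enforce the limit.

```python
import subprocess
import sys

# Values copied from the configuration above (seconds).
WORKFLOWS_PLOTEXTRACT_TIMEOUT = 300
WORKFLOWS_REFEXTRACT_TIMEOUT = 600


def run_with_timeout(command, timeout):
    """Run ``command`` and return its exit code.

    Returns None when the command exceeded ``timeout`` seconds;
    ``subprocess.run`` kills the child process in that case.
    """
    try:
        completed = subprocess.run(command, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


# Hypothetical usage: a trivial command that finishes well within the limit.
exit_code = run_with_timeout(
    [sys.executable, '-c', 'pass'],
    timeout=WORKFLOWS_PLOTEXTRACT_TIMEOUT,
)
```

A return value of None tells the caller the task was killed, which it can then treat as a failed extraction.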

inspirehep.modules.workflows.errors module

exception inspirehep.modules.workflows.errors.CallbackError[source]

Bases: invenio_workflows.errors.WorkflowsError

Callback exception.

code = 400
error_code = 'CALLBACK_ERROR'
errors = None
message = 'Workflow callback error.'
to_dict()[source]

Convert the exception to a dictionary.

workflow = None
exception inspirehep.modules.workflows.errors.CallbackMalformedError(errors=None, **kwargs)[source]

Bases: inspirehep.modules.workflows.errors.CallbackError

Malformed request exception.

error_code = 'MALFORMED'
message = 'The workflow request is malformed.'
exception inspirehep.modules.workflows.errors.CallbackRecordNotFoundError(recid, **kwargs)[source]

Bases: inspirehep.modules.workflows.errors.CallbackError

Record not found exception.

code = 404
error_code = 'RECORD_NOT_FOUND'
exception inspirehep.modules.workflows.errors.CallbackValidationError(workflow_data, **kwargs)[source]

Bases: inspirehep.modules.workflows.errors.CallbackError

Validation error exception.

error_code = 'VALIDATION_ERROR'
message = 'Validation error.'
exception inspirehep.modules.workflows.errors.CallbackWorkflowNotFoundError(workflow_id, **kwargs)[source]

Bases: inspirehep.modules.workflows.errors.CallbackError

Workflow not found exception.

code = 404
error_code = 'WORKFLOW_NOT_FOUND'
exception inspirehep.modules.workflows.errors.CallbackWorkflowNotInMergeState(workflow_id, **kwargs)[source]

Bases: inspirehep.modules.workflows.errors.CallbackError

Workflow not in merge state exception.

error_code = 'WORKFLOW_NOT_IN_MERGE_STATE'
exception inspirehep.modules.workflows.errors.CallbackWorkflowNotInValidationError(workflow_id, **kwargs)[source]

Bases: inspirehep.modules.workflows.errors.CallbackError

Workflow not in validation error state exception.

error_code = 'WORKFLOW_NOT_IN_ERROR_STATE'
exception inspirehep.modules.workflows.errors.CallbackWorkflowNotInWaitingEditState(workflow_id, **kwargs)[source]

Bases: inspirehep.modules.workflows.errors.CallbackError

Workflow not in waiting-for-curation state exception.

error_code = 'WORKFLOW_NOT_IN_WAITING_FOR_CURATION_STATE'
exception inspirehep.modules.workflows.errors.DownloadError[source]

Bases: invenio_workflows.errors.WorkflowsError

Error representing a failed download in a workflow.

exception inspirehep.modules.workflows.errors.MergeError[source]

Bases: invenio_workflows.errors.WorkflowsError

Error representing a failed merge in a workflow.
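
The callback exceptions above share a pattern: each subclass overrides class attributes such as `code`, `error_code` and `message`, and `to_dict()` serializes them. The sketch below imitates that pattern with stand-in classes. It derives from the built-in `Exception` rather than `invenio_workflows.errors.WorkflowsError`, and the body of `to_dict()` and the "not found" message text are assumptions made for illustration.

```python
class CallbackError(Exception):
    """Stand-in base class mirroring the documented attributes."""

    code = 400
    error_code = 'CALLBACK_ERROR'
    errors = None
    message = 'Workflow callback error.'
    workflow = None

    def to_dict(self):
        # Assumed layout: always include the error code and message,
        # and add the optional fields only when they are set.
        response = {
            'error_code': self.error_code,
            'message': self.message,
        }
        if self.errors is not None:
            response['errors'] = self.errors
        if self.workflow is not None:
            response['workflow'] = self.workflow
        return response


class CallbackWorkflowNotFoundError(CallbackError):
    """Stand-in subclass: overrides the HTTP code and the error code."""

    code = 404
    error_code = 'WORKFLOW_NOT_FOUND'

    def __init__(self, workflow_id):
        super().__init__()
        # Hypothetical message text, not taken from the module.
        self.message = 'Workflow {} was not found.'.format(workflow_id)


error = CallbackWorkflowNotFoundError(910648)
payload = error.to_dict()
```

With this layout an error handler only needs `error.code` for the HTTP status and `error.to_dict()` for the response body, whichever subclass was raised.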

inspirehep.modules.workflows.ext module

Workflows extension.

class inspirehep.modules.workflows.ext.InspireWorkflows(app=None)[source]

Bases: object

init_app(app)[source]
init_config(app)[source]

inspirehep.modules.workflows.loaders module

Workflows loader.

inspirehep.modules.workflows.loaders.marshmallow_loader(schema_class, partial=False)[source]

Marshmallow loader.

inspirehep.modules.workflows.loaders.workflow_loader()

inspirehep.modules.workflows.models module

Extra models for workflows.

class inspirehep.modules.workflows.models.Timestamp[source]

Bases: object

Timestamp model mix-in with fractional seconds support. SQLAlchemy-Utils timestamp model does not have support for fractional seconds.

created = Column(None, DateTime(), table=None, default=ColumnDefault(<function utcnow>))
updated = Column(None, DateTime(), table=None, default=ColumnDefault(<function utcnow>))
class inspirehep.modules.workflows.models.WorkflowsAudit(**kwargs)[source]

Bases: sqlalchemy.ext.declarative.api.Model

action
created
decision
id
object_id
save()[source]

Save object to persistent storage.

score
source
user_action
user_id
class inspirehep.modules.workflows.models.WorkflowsPendingRecord(**kwargs)[source]

Bases: sqlalchemy.ext.declarative.api.Model

record_id
workflow_id
class inspirehep.modules.workflows.models.WorkflowsRecordSources(**kwargs)[source]

Bases: sqlalchemy.ext.declarative.api.Model, inspirehep.modules.workflows.models.Timestamp

created
json
record_uuid
source
updated
inspirehep.modules.workflows.models.timestamp_before_update(mapper, connection, target)[source]

Update updated property with current time on before_update event.
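
To illustrate the listener signature, here is a minimal sketch that uses plain attributes instead of SQLAlchemy columns. The stand-in `Timestamp` class is an assumption made so the example runs without a database; in SQLAlchemy a function with this signature would typically be registered for the `before_update` mapper event with `sqlalchemy.event.listen`.

```python
from datetime import datetime, timezone


class Timestamp(object):
    """Stand-in for the mix-in: plain attributes, no SQLAlchemy columns."""

    def __init__(self):
        now = datetime.now(timezone.utc)
        self.created = now
        self.updated = now


def timestamp_before_update(mapper, connection, target):
    """Refresh ``updated`` on the object about to be written.

    ``mapper`` and ``connection`` are unused here; they are part of the
    signature SQLAlchemy passes to ``before_update`` listeners.
    """
    target.updated = datetime.now(timezone.utc)


row = Timestamp()
created_at = row.created
timestamp_before_update(None, None, row)
```

Python `datetime` values carry microseconds, which is where the fractional seconds come from in this sketch.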

inspirehep.modules.workflows.proxies module

Extra models for workflows.

inspirehep.modules.workflows.proxies.load_antikeywords(*args, **kwds)[source]

Load the list of antihep keywords. Note that the result is cached.

inspirehep.modules.workflows.search module

Search factory for INSPIRE workflows UI.

We specify in this custom search factory which fields elasticsearch should return in order to not always return the entire record.

Add a key path to the includes variable to include it in the API output when listing/searching across workflow objects (Holding Pen).

inspirehep.modules.workflows.search.holdingpen_search_factory(self, search, **kwargs)[source]

Override search factory.
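
The idea of returning only selected fields can be sketched with Elasticsearch source filtering, where the `_source` section of a search body lists the key paths to include. The helper name and the field paths below are hypothetical, and the real factory may build its query differently.

```python
def build_holdingpen_query(query_string, includes):
    """Build a search body that returns only the listed key paths."""
    return {
        'query': {'query_string': {'query': query_string}},
        # Source filtering: Elasticsearch returns only these fields
        # instead of the entire stored document.
        '_source': {'includes': list(includes)},
    }


# Hypothetical key paths, for illustration only.
includes = ['id', 'metadata.titles']
body = build_holdingpen_query('status:HALTED', includes)
```

Adding a key path to the list is then enough to make that field appear in the listing output.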

inspirehep.modules.workflows.views module

Callback blueprint for interaction with legacy.

class inspirehep.modules.workflows.views.ResolveEditArticleResource[source]

Bases: flask.views.MethodView

Resolve edit_article callback.

When the workflow needs to resolve conflicts, it stops in HALTED state; this endpoint is called to continue it. If the endpoint is called while the conflicts are still unresolved, it just saves the workflow.

Parameters: workflow_data (dict) – the workflow object sent in the request’s payload.
methods = ['PUT']
put()[source]

Handle callback for merge conflicts.

class inspirehep.modules.workflows.views.ResolveMergeResource[source]

Bases: flask.views.MethodView

Resolve merge callback.

When the workflow needs to resolve conflicts, it stops in HALTED state; this endpoint is called to continue it. If the endpoint is called while the conflicts are still unresolved, it just saves the workflow.

Parameters: workflow_data (dict) – the workflow object sent in the request’s payload.
methods = ['PUT']
put()[source]

Handle callback for merge conflicts.
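
The behaviour described for this resource (continue when the conflicts are resolved, otherwise just save) can be sketched as a small decision function. The `conflicts` key under `_extra_data` and the returned action names are assumptions made for this example, not confirmed details of the view.

```python
def decide_merge_action(workflow_data):
    """Return 'continue' when no conflicts remain, otherwise 'save'."""
    extra_data = workflow_data.get('_extra_data', {})
    # Assumed location of the unresolved conflicts.
    conflicts = extra_data.get('conflicts', [])
    if conflicts:
        return 'save'
    return 'continue'


resolved = {'id': 910648, '_extra_data': {'conflicts': []}}
unresolved = {'id': 910648, '_extra_data': {'conflicts': [{'path': '/titles/0'}]}}
actions = (decide_merge_action(resolved), decide_merge_action(unresolved))
```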

class inspirehep.modules.workflows.views.ResolveValidationResource[source]

Bases: flask.views.MethodView

Resolve validation error callback.

methods = ['PUT']
put()[source]

Handle callback from validation errors.

When validation errors occur, the workflow stops in ERROR state; this endpoint is called to continue it.

Parameters: workflow_data (dict) – the workflow object sent in the request’s payload.

Examples

An example of successful call:

$ curl \
    -X PUT \
    http://web:5000/callback/workflows/resolve_validation_errors \
    -H "Host: localhost:5000" \
    -H "Content-Type: application/json" \
    -d '{
        "_extra_data": {
            ... extra data content
        },
        "id": 910648,
        "metadata": {
            "$schema": "https://labs.inspirehep.net/schemas/records/hep.json",
            ... record content
        }
    }'

The response:

HTTP 200 OK

{"message": "Workflow 910648 validated, continuing it."}

A failed example:

$ curl \
    -X PUT \
    http://web:5000/callback/workflows/resolve_validation_errors \
    -H "Host: localhost:5000" \
    -H "Content-Type: application/json" \
    -d '{
        "_extra_data": {
            ... extra data content
        },
        "id": 910648,
        "metadata": {
            "$schema": "https://labs.inspirehep.net/schemas/records/hep.json",
            ... record content
        }
    }'

The error response will contain the workflow that was passed, with the new validation errors:

HTTP 400 Bad Request

{
    "_extra_data": {
        "validation_errors": [
            {
                "path": ["path", "to", "error"],
                "message": "required: ['missing_key1', 'missing_key2']"
            }
        ],
        ... rest of extra data content
    },
    "id": 910648,
    "metadata": {
        "$schema": "https://labs.inspirehep.net/schemas/records/hep.json",
        ... record content
    }
}
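
For callers that prefer Python over curl, the same request can be prepared with the standard library. The sketch below only builds the request object and does not send it. The helper name is hypothetical, the payload is a trimmed placeholder, and PUT is used because the resource lists `methods = ['PUT']`.

```python
import json
from urllib.request import Request


def build_resolve_validation_request(base_url, workflow_data):
    """Prepare a PUT request carrying the workflow object as JSON."""
    body = json.dumps(workflow_data).encode('utf-8')
    return Request(
        url=base_url + '/callback/workflows/resolve_validation_errors',
        data=body,
        headers={'Content-Type': 'application/json'},
        method='PUT',
    )


# Placeholder payload: the real one carries the full extra data and record.
workflow_data = {
    '_extra_data': {},
    'id': 910648,
    'metadata': {
        '$schema': 'https://labs.inspirehep.net/schemas/records/hep.json',
    },
}
request = build_resolve_validation_request('http://web:5000', workflow_data)
```

Passing `request` to `urllib.request.urlopen` would send it; a 400 response would then carry the workflow back with its validation errors, as in the failed example above.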

inspirehep.modules.workflows.views.callback_resolve_edit_article(*args, **kwargs)

Resolve edit_article callback.

When the workflow needs to resolve conflicts, it stops in HALTED state; this endpoint is called to continue it. If the endpoint is called while the conflicts are still unresolved, it just saves the workflow.

Parameters: workflow_data (dict) – the workflow object sent in the request’s payload.
inspirehep.modules.workflows.views.callback_resolve_merge_conflicts(*args, **kwargs)

Resolve merge callback.

When the workflow needs to resolve conflicts, it stops in HALTED state; this endpoint is called to continue it. If the endpoint is called while the conflicts are still unresolved, it just saves the workflow.

Parameters: workflow_data (dict) – the workflow object sent in the request’s payload.
inspirehep.modules.workflows.views.callback_resolve_validation(*args, **kwargs)

Resolve validation error callback.

inspirehep.modules.workflows.views.error_handler(error)[source]

Callback error handler.

inspirehep.modules.workflows.views.inspect_merge(holdingpen_id)[source]
inspirehep.modules.workflows.views.robotupload_callback()[source]

Handle callback from robotupload.

If robotupload was successful, caches the workflow object id that corresponds to the uploaded record, so the workflow can be resumed when webcoll finishes processing that record. If robotupload encountered an error, sends an email to the site administrator describing the error.

Examples

An example of a failed callback that did not get to create a recid (the “nonce” is the workflow id):

$ curl \
    http://web:5000/callback/workflows/robotupload \
    -H "Host: localhost:5000" \
    -H "Content-Type: application/json" \
    -d '{
        "nonce": 1,
        "results": [
            {
                "recid":-1,
                "error_message": "Record already exists",
                "success": false
            }
        ]
    }'

One that created the recid, but failed later:

$ curl \
    http://web:5000/callback/workflows/robotupload \
    -H "Host: localhost:5000" \
    -H "Content-Type: application/json" \
    -d '{
        "nonce": 1,
        "results": [
            {
                "recid":1234,
                "error_message": "Unable to parse pdf.",
                "success": false
            }
        ]
    }'

A successful one:

$ curl \
    http://web:5000/callback/workflows/robotupload \
    -H "Host: localhost:5000" \
    -H "Content-Type: application/json" \
    -d '{
        "nonce": 1,
        "results": [
            {
                "recid":1234,
                "error_message": "",
                "success": true
            }
        ]
    }'
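
The two outcomes described above can be sketched as a pure function over the callback payload. A dict stands in for the cache of pending records and a list stands in for outgoing email; both, along with the function name and the message wording, are assumptions made for this example.

```python
def handle_robotupload_results(payload, pending_records, outbox):
    """Record successful uploads and queue a notice for failed ones.

    ``pending_records`` maps recid -> workflow id (stand-in for the cache).
    ``outbox`` collects error notices (stand-in for sending email).
    """
    workflow_id = payload['nonce']  # the nonce is the workflow id
    for result in payload.get('results', []):
        if result.get('success'):
            pending_records[result['recid']] = workflow_id
        else:
            outbox.append(
                'Workflow {}: robotupload failed for recid {}: {}'.format(
                    workflow_id, result.get('recid'), result.get('error_message')
                )
            )


pending_records, outbox = {}, []
handle_robotupload_results(
    {'nonce': 1, 'results': [{'recid': 1234, 'error_message': '', 'success': True}]},
    pending_records,
    outbox,
)
handle_robotupload_results(
    {'nonce': 2, 'results': [{'recid': -1, 'error_message': 'Record already exists', 'success': False}]},
    pending_records,
    outbox,
)
```
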
inspirehep.modules.workflows.views.start_edit_article_workflow(recid)[source]
inspirehep.modules.workflows.views.webcoll_callback()[source]

Handle a callback from webcoll with the record ids processed.

Expects the request data to contain a list of record ids in the recids field.

Example

An example of callback:

$ curl \
    http://web:5000/callback/workflows/webcoll \
    -H "Host: localhost:5000" \
    -F 'recids=1234'
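
A minimal sketch of the resume step, assuming the pending records are kept as a mapping from recid to workflow id (a plain dict here, and a hypothetical function name): each processed recid is looked up, and the matching workflow ids are returned so they can be resumed.

```python
def workflows_to_resume(recids, pending_records):
    """Return the workflow ids waiting on the processed record ids.

    Matching entries are removed from ``pending_records`` so that a
    repeated callback does not resume the same workflow twice.
    """
    resumed = []
    for recid in recids:
        # Form fields arrive as strings, so normalize to int first.
        workflow_id = pending_records.pop(int(recid), None)
        if workflow_id is not None:
            resumed.append(workflow_id)
    return resumed


pending_records = {1234: 1}
resumed = workflows_to_resume(['1234', '999'], pending_records)
```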

Module contents

Workflows module.