inspirehep.modules.migrator package

Submodules

inspirehep.modules.migrator.cli module

Manage migration from the INSPIRE legacy instance.

inspirehep.modules.migrator.cli.halt_if_debug_mode(force)[source]

inspirehep.modules.migrator.dumper module

Migrator dumper.

inspirehep.modules.migrator.dumper.marshmallow_dumper(schema_class)[source]

Marshmallow dumper.

inspirehep.modules.migrator.dumper.migrator_error_list_dumper(results, many=False)

inspirehep.modules.migrator.ext module

Migrator extension.

class inspirehep.modules.migrator.ext.InspireMigrator(app=None)[source]

Bases: object

init_app(app)[source]

inspirehep.modules.migrator.models module

Models for Migrator.

class inspirehep.modules.migrator.models.LegacyRecordsMirror(**kwargs)[source]

Bases: sqlalchemy.ext.declarative.api.Model

collection
error
classmethod from_marcxml(raw_record)[source]

Create an instance from a MARCXML record.

The record must have a 001 tag containing the recid, otherwise it raises a ValueError.
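The 001 requirement can be illustrated with a minimal sketch. The pattern and the helper name `extract_recid` below are assumptions made for illustration; the actual `re_recid` pattern and the internals of `from_marcxml` are not shown on this page.

```python
import re

# Hypothetical pattern: the real ``re_recid`` attribute is not documented here.
RE_RECID = re.compile(r'<controlfield\s+tag="001">\s*(\d+)\s*</controlfield>')


def extract_recid(raw_record):
    """Return the recid stored in the 001 controlfield, or raise ValueError."""
    match = RE_RECID.search(raw_record)
    if match is None:
        # Mirrors the documented behaviour: no 001 tag means ValueError.
        raise ValueError("record has no 001 tag containing a recid")
    return int(match.group(1))
```

A caller would typically wrap the call in `try`/`except ValueError` so that a malformed record is reported rather than aborting a whole batch.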

last_updated
marcxml

marcxml column wrapper to compress/decompress on the fly.
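One way such a wrapper could work is a property that compresses on write and decompresses on read. This is only a sketch: the class `MirrorRecordSketch`, the private attribute `_marcxml`, and the choice of `zlib` are assumptions, and no SQLAlchemy machinery is involved.

```python
import zlib


class MirrorRecordSketch(object):
    """Hypothetical stand-in for the model; only the wrapper idea is shown."""

    def __init__(self, marcxml):
        self.marcxml = marcxml  # goes through the setter below

    @property
    def marcxml(self):
        # Decompress on read so that callers always see text.
        return zlib.decompress(self._marcxml).decode("utf-8")

    @marcxml.setter
    def marcxml(self, value):
        # Compress on write so that only compressed bytes are stored.
        if isinstance(value, str):
            value = value.encode("utf-8")
        self._marcxml = zlib.compress(value)
```

Code that reads or assigns `marcxml` never sees the compressed form, which stays confined to the stored attribute.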

re_recid = <compiled regular expression>
recid
valid
inspirehep.modules.migrator.models.timestamp_before_update(mapper, connection, target)[source]

Update last_updated property with current time on before_update event.

inspirehep.modules.migrator.permissions module

inspirehep.modules.migrator.tasks module

Manage migration from INSPIRE legacy instance.

inspirehep.modules.migrator.tasks.chunker(iterable, chunksize=100)[source]
(task)inspirehep.modules.migrator.tasks.continuous_migration[source]

Task to continuously migrate what is pushed up by Legacy.

inspirehep.modules.migrator.tasks.disable_orcid_push(task_function)[source]

Temporarily disable ORCID push.

Decorator to temporarily disable ORCID push while a given task is running, and only for that task. It restores the previous state when the task finishes or fails. This does not interfere with other tasks, for two reasons: the push is disabled only for the decorated task, and the configuration is changed only within the worker’s process, so tasks running in parallel are unaffected.
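The save, disable, restore pattern described above might look roughly like the following. The `CONFIG` dictionary and the key `ORCID_PUSH_ENABLED` are hypothetical stand-ins for the application configuration; the real flag name is not given on this page.

```python
import functools

# Hypothetical process-local configuration; the real flag name is an assumption.
CONFIG = {"ORCID_PUSH_ENABLED": True}


def disable_orcid_push_sketch(task_function):
    """Run ``task_function`` with the (hypothetical) ORCID push flag turned off."""

    @functools.wraps(task_function)
    def wrapper(*args, **kwargs):
        previous = CONFIG["ORCID_PUSH_ENABLED"]
        CONFIG["ORCID_PUSH_ENABLED"] = False
        try:
            return task_function(*args, **kwargs)
        finally:
            # Restore the earlier value on success and on error alike.
            CONFIG["ORCID_PUSH_ENABLED"] = previous

    return wrapper


@disable_orcid_push_sketch
def example_task():
    # Reports the flag value seen while the task is running.
    return CONFIG["ORCID_PUSH_ENABLED"]
```

Using `try`/`finally` is what guarantees the restore step even when the task raises.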

inspirehep.modules.migrator.tasks.insert_into_mirror(raw_records)[source]
inspirehep.modules.migrator.tasks.migrate_and_insert_record(raw_record, skip_files=False)[source]

Migrate a record and insert it if valid, or log otherwise.

inspirehep.modules.migrator.tasks.migrate_from_file(source, wait_for_results=False)[source]
inspirehep.modules.migrator.tasks.migrate_from_mirror(also_migrate=None, wait_for_results=False, skip_files=None)[source]

Migrate legacy records from the local mirror.

By default, only the records that have not been migrated yet are migrated.

Parameters:
  • also_migrate (Optional[string]) – if set to 'broken', broken records will also be migrated. If set to 'all', all records will be migrated.
  • skip_files (Optional[bool]) – flag indicating whether the files in the record metadata should be copied over from legacy and attached to the record. If None, the corresponding setting is read from the configuration.
  • wait_for_results (bool) – flag indicating whether the task should wait for the migration to finish (if True) or fire and forget the migration tasks (if False).
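The effect of `also_migrate` can be sketched on plain dictionaries. This rests on an assumption not stated on this page: that the `valid` field is `None` for records not yet migrated, `False` for broken ones, and `True` for successfully migrated ones. The function name is hypothetical and the real task queries the database instead.

```python
def select_for_migration(records, also_migrate=None):
    """Pick mirror entries to migrate; ``records`` are dicts with a ``valid`` key.

    Assumption: ``valid`` is None when not yet migrated, False when broken,
    and True when migrated successfully.
    """
    if also_migrate is None:
        # Default: only records that have not been migrated yet.
        return [r for r in records if r["valid"] is None]
    if also_migrate == "broken":
        # Not-yet-migrated records plus the broken ones.
        return [r for r in records if r["valid"] is not True]
    if also_migrate == "all":
        return list(records)
    raise ValueError("also_migrate must be None, 'broken' or 'all'")
```

Rejecting unknown values with `ValueError` is a design choice of this sketch, not documented behaviour.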
(task)inspirehep.modules.migrator.tasks.migrate_recids_from_mirror[source]
inspirehep.modules.migrator.tasks.migrate_record_from_legacy(recid)[source]
inspirehep.modules.migrator.tasks.migrate_record_from_mirror(prod_record, skip_files=False)[source]

Migrate a mirrored legacy record into an Inspire record.

Parameters:
  • prod_record (LegacyRecordsMirror) – the mirrored record to migrate.
  • skip_files (bool) – flag indicating whether the files in the record metadata should be copied over from legacy and attached to the record.
Returns:
  the migrated record metadata, which is also inserted into the database.
Return type:
  dict

inspirehep.modules.migrator.tasks.populate_mirror_from_file(source)[source]
inspirehep.modules.migrator.tasks.read_file(source)[source]
inspirehep.modules.migrator.tasks.split_blob(blob)[source]

Split the blob using <record.*?>.*?</record> as pattern.
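A minimal sketch of splitting with that pattern follows. The `re.DOTALL` flag is an assumption (it lets a record span several lines), and the function name is hypothetical.

```python
import re

# The pattern is quoted from the description; the DOTALL flag is an assumption.
RE_RECORD = re.compile(r"<record.*?>.*?</record>", re.DOTALL)


def split_blob_sketch(blob):
    """Yield each ``<record>...</record>`` substring found in ``blob``."""
    for match in RE_RECORD.finditer(blob):
        yield match.group()
```

The non-greedy `.*?` keeps each match from running past the first closing `</record>` tag.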

inspirehep.modules.migrator.tasks.split_stream(stream)[source]

Split the stream using <record.*?>.*?</record> as pattern.

This operates line by line in order not to load the entire file in memory.
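The line-by-line approach could be sketched as a generator that buffers only the current record. It assumes that a record's opening tag starts a line and that at most one record ends on any line; the names are hypothetical and the real implementation may differ.

```python
import re

# Hypothetical helper pattern for the opening tag of a record.
RE_OPEN = re.compile(r"<record.*?>")


def split_stream_sketch(stream):
    """Yield one record at a time while reading ``stream`` line by line."""
    lines = []
    inside = False
    for line in stream:
        if not inside and RE_OPEN.search(line):
            inside = True
        if inside:
            lines.append(line)
            if "</record>" in line:
                # Record complete: emit it and drop the buffered lines.
                yield "".join(lines)
                lines = []
                inside = False
```

Because only the lines of the record being read are held at any time, memory use depends on the largest record rather than on the file size.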

inspirehep.modules.migrator.utils module

Migrator utils.

inspirehep.modules.migrator.utils.get_collection(marc_record)[source]
inspirehep.modules.migrator.utils.get_collection_from_marcxml(marcxml)[source]

inspirehep.modules.migrator.views module

class inspirehep.modules.migrator.views.MigratorErrorListResource[source]

Bases: flask.views.MethodView

Return a list of errors belonging to invalid mirror records.

decorators = [<flask_principal.IdentityContext object>]
get()[source]
methods = ['GET']
inspirehep.modules.migrator.views.migrator_error_list_resource(*args, **kw)

Return a list of errors belonging to invalid mirror records.

Module contents

INSPIRE migrator module.