inspirehep.modules.hal package

Submodules

inspirehep.modules.hal.bulk_push module

IMPORTANT: This script is a copy/paste of: https://github.com/inspirehep/inspire-next/issues/2629

It is unreliable and absolutely unmaintainable. It will be refactored with this user story: https://its.cern.ch/jira/browse/INSPIR-249

To be run with: $ /usr/bin/time -v inspirehep hal push

inspirehep.modules.hal.bulk_push.run(username, password, limit, yield_amt)[source]

inspirehep.modules.hal.cli module

inspirehep.modules.hal.config module

HAL configuration.

inspirehep.modules.hal.config.HAL_COL_IRI = 'https://api-preprod.archives-ouvertes.fr/sword/hal'

IRI used by the SWORD protocol when creating a new record on HAL.

Note

Use this to send records to their staging instance. To send records to their production instance use the same IRI without -preprod.
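
As a minimal sketch of the note above, the production IRI can be derived from the staging one by dropping the -preprod marker from the host name. The helper name to_production_iri is hypothetical and is not part of this module.

```python
# Value copied from inspirehep.modules.hal.config.
HAL_COL_IRI = 'https://api-preprod.archives-ouvertes.fr/sword/hal'


def to_production_iri(staging_iri):
    """Hypothetical helper: drop the first '-preprod' marker from an IRI."""
    return staging_iri.replace('-preprod', '', 1)


# The same IRI, without -preprod, targets the production instance.
production_iri = to_production_iri(HAL_COL_IRI)
```

The same transformation would apply to HAL_EDIT_IRI, since both settings follow the same naming pattern.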

inspirehep.modules.hal.config.HAL_DOMAIN_MAPPING = {'Instrumentation': 'phys.phys.phys-ins-det', 'Data Analysis and Statistics': 'phys.phys.phys-data-an', 'Experiment-Nucl': 'phys.nexp', 'Math and Math Physics': 'phys.mphy', 'Theory-HEP': 'phys.hthe', 'Theory-Nucl': 'phys.nucl', 'Lattice': 'phys.hlat', 'Other': 'phys', 'Astrophysics': 'phys.astr', 'General Physics': 'phys.phys.phys-gen-ph', 'Experiment-HEP': 'phys.hexp', 'Computing': 'info', 'Phenomenology-HEP': 'phys.hphe', 'Gravitation and Cosmology': 'phys.grqc', 'Accelerators': 'phys.phys.phys-acc-ph'}

Mapping used when converting from INSPIRE categories to HAL domains.

inspirehep.modules.hal.config.HAL_EDIT_IRI = 'https://api-preprod.archives-ouvertes.fr/sword/'

IRI used by the SWORD protocol when updating an existing record on HAL.

Note

Use this to update records on their staging instance. To update records on their production instance use the same IRI without -preprod.

inspirehep.modules.hal.config.HAL_IGNORE_CERTIFICATES = False

Whether to ignore certificate errors when connecting to HAL. The default, False, keeps certificate checks enabled.

inspirehep.modules.hal.config.HAL_USER_NAME = 'hal_user_name'

Name of the INSPIRE user on HAL.

Note

Its real value is stored in tbag. In particular QA_HAL_USER_NAME contains the value to use for their staging instance, while PROD_HAL_USER_NAME contains the value to use for their production instance.

inspirehep.modules.hal.config.HAL_USER_PASS = 'hal_user_pass'

Password of the INSPIRE user on HAL.

Note

Its real value is stored in tbag. In particular QA_HAL_USER_PASS contains the value to use for their staging instance, while PROD_HAL_USER_PASS contains the value to use for their production instance.

inspirehep.modules.hal.ext module

HAL extension.

class inspirehep.modules.hal.ext.InspireHAL(app=None)[source]

Bases: object

init_app(app)[source]
init_config(app)[source]

inspirehep.modules.hal.tasks module

HAL tasks.

(task)inspirehep.modules.hal.tasks.hal_push[source]

Run a HAL push.

inspirehep.modules.hal.tasks.send_hal_push_start_email(mailing_list)[source]
inspirehep.modules.hal.tasks.send_hal_push_summary_email(mailing_list, total, ok, now, ko, attached_files=None)[source]

Send an email summarizing the HAL push.

inspirehep.modules.hal.utils module

HAL utils.

inspirehep.modules.hal.utils.get_authors(record)[source]

Return the authors of a record.

Queries the Institution records linked from the authors' affiliations to add, whenever it exists, the HAL identifier of the institution to the affiliation.

Parameters:record (InspireRecord) – a record.
Returns:the authors of the record.
Return type:list(dict)

Examples

>>> record = {
...     'authors': [
...         {
...             'affiliations': [
...                 {
...                     'record': {
...                         '$ref': 'http://localhost:5000/api/institutions/902725',
...                     }
...                 },
...             ],
...         },
...     ],
... }
>>> authors = get_authors(record)
>>> authors[0]['affiliations'][0]['hal_id']
'300037'
inspirehep.modules.hal.utils.get_conference_city(record)[source]

Return the first city of a Conference record.

Parameters:record (InspireRecord) – a Conference record.
Returns:the first city of the Conference record.
Return type:string

Examples

>>> record = {'address': [{'cities': ['Tokyo']}]}
>>> get_conference_city(record)
'Tokyo'
inspirehep.modules.hal.utils.get_conference_country(record)[source]

Return the first country of a Conference record.

Parameters:record (InspireRecord) – a Conference record.
Returns:the first country of the Conference record.
Return type:string

Examples

>>> record = {'address': [{'country_code': 'JP'}]}
>>> get_conference_country(record)
'jp'
inspirehep.modules.hal.utils.get_conference_end_date(record)[source]

Return the closing date of a Conference record.

Parameters:record (InspireRecord) – a Conference record.
Returns:the closing date of the Conference record.
Return type:string

Examples

>>> record = {'closing_date': '1999-11-19'}
>>> get_conference_end_date(record)
'1999-11-19'
inspirehep.modules.hal.utils.get_conference_record(record, default=None)[source]

Return the first Conference record associated with a record.

Queries the database to fetch the first Conference record referenced in the publication_info of the record.

Parameters:
  • record (InspireRecord) – a record.
  • default – value to return if no Conference record is present or found.
Returns:

the first Conference record associated with the record.

Return type:

InspireRecord

Examples

>>> record = {
...     'publication_info': [
...         {
...             'conference_record': {
...                 '$ref': '/api/conferences/972464',
...             },
...         },
...     ],
... }
>>> conference_record = get_conference_record(record)
>>> conference_record['control_number']
972464
inspirehep.modules.hal.utils.get_conference_start_date(record)[source]

Return the opening date of a Conference record.

Parameters:record (InspireRecord) – a Conference record.
Returns:the opening date of the Conference record.
Return type:string

Examples

>>> record = {'opening_date': '1999-11-16'}
>>> get_conference_start_date(record)
'1999-11-16'
inspirehep.modules.hal.utils.get_conference_title(record, default='')[source]

Return the first title of a Conference record.

Parameters:record (InspireRecord) – a Conference record.
Returns:the first title of the Conference record.
Return type:string

Examples

>>> record = {'titles': [{'title': 'Workshop on Neutrino Physics'}]}
>>> get_conference_title(record)
'Workshop on Neutrino Physics'
inspirehep.modules.hal.utils.get_divulgation(record)[source]

Return 1 if a record is intended for the general public, 0 otherwise.

Parameters:record (InspireRecord) – a record.
Returns:1 if the record is intended for the general public, 0 otherwise.
Return type:int

Examples

>>> get_divulgation({'publication_type': ['introductory']})
1
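
The example above suggests that the flag depends on the presence of 'introductory' among the publication types. A minimal sketch under that assumption follows; the name divulgation_flag is hypothetical and this is not necessarily how the module implements it.

```python
def divulgation_flag(record):
    """Return 1 if 'introductory' is among the publication types, else 0.

    Assumption: 'introductory' is the marker for records aimed at the
    general public, as inferred from the documented example.
    """
    return int('introductory' in record.get('publication_type', []))
```
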
inspirehep.modules.hal.utils.get_document_types(record)[source]

Return all document types of a record.

Parameters:record (InspireRecord) – a record.
Returns:all document types of the record.
Return type:list(str)

Examples

>>> get_document_types({'document_type': ['article']})
['article']
inspirehep.modules.hal.utils.get_doi(record)[source]

Return the first DOI of a record.

Parameters:record (InspireRecord) – a record.
Returns:the first DOI of the record.
Return type:string

Examples

>>> get_doi({'dois': [{'value': '10.1016/0029-5582(61)90469-2'}]})
'10.1016/0029-5582(61)90469-2'
inspirehep.modules.hal.utils.get_domains(record)[source]

Return the HAL domains of a record.

Uses the mapping in the configuration to convert all INSPIRE categories to the corresponding HAL domains.

Parameters:record (InspireRecord) – a record.
Returns:the HAL domains of the record.
Return type:list(str)

Examples

>>> record = {'inspire_categories': [{'term': 'Experiment-HEP'}]}
>>> get_domains(record)
['phys.hexp']
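
The conversion described above can be sketched as a plain dictionary lookup. The sketch below uses an excerpt of HAL_DOMAIN_MAPPING and a hypothetical function name, map_domains; skipping categories that are missing from the mapping is an assumption, not documented behavior.

```python
# Excerpt of HAL_DOMAIN_MAPPING from inspirehep.modules.hal.config.
HAL_DOMAIN_MAPPING = {
    'Experiment-HEP': 'phys.hexp',
    'Theory-HEP': 'phys.hthe',
    'Other': 'phys',
}


def map_domains(record, mapping=HAL_DOMAIN_MAPPING):
    """Convert the INSPIRE category terms of a record to HAL domains."""
    terms = [
        category['term']
        for category in record.get('inspire_categories', [])
        if 'term' in category
    ]
    # Assumption: terms without a mapping entry are silently skipped.
    return [mapping[term] for term in terms if term in mapping]
```
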
inspirehep.modules.hal.utils.get_inspire_id(record)[source]

Return the INSPIRE id of a record.

Parameters:record (InspireRecord) – a record.
Returns:the INSPIRE id of the record.
Return type:int

Examples

>>> get_inspire_id({'control_number': 1507156})
1507156
inspirehep.modules.hal.utils.get_journal_issue(record)[source]

Return the issue of the journal a record was published in.

Parameters:record (InspireRecord) – a record.
Returns:the issue of the journal the record was published in.
Return type:string

Examples

>>> record = {
...    'publication_info': [
...        {'journal_issue': '5'},
...    ],
... }
>>> get_journal_issue(record)
'5'
inspirehep.modules.hal.utils.get_journal_title(record)[source]

Return the title of the journal a record was published in.

Parameters:record (InspireRecord) – a record.
Returns:the title of the journal the record was published in.
Return type:string

Examples

>>> record = {
...     'publication_info': [
...         {'journal_title': 'Phys.Part.Nucl.Lett.'},
...     ],
... }
>>> get_journal_title(record)
'Phys.Part.Nucl.Lett.'
inspirehep.modules.hal.utils.get_journal_volume(record)[source]

Return the volume of the journal a record was published in.

Parameters:record (InspireRecord) – a record.
Returns:the volume of the journal the record was published in.
Return type:string

Examples

>>> record = {
...     'publication_info': [
...         {'journal_volume': 'D94'},
...     ],
... }
>>> get_journal_volume(record)
'D94'
inspirehep.modules.hal.utils.get_language(record)[source]

Return the first language of a record.

If it is not specified in the record we assume that the language is English, so we return 'en'.

Parameters:record (InspireRecord) – a record.
Returns:the first language of the record.
Return type:string

Examples

>>> get_language({'languages': ['it']})
'it'
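
The fallback to English described above can be sketched as follows. The name first_language is hypothetical, and treating an empty languages list like a missing one is an assumption.

```python
def first_language(record, default='en'):
    """Return the first language of a record, or 'en' if none is given."""
    # A missing or empty 'languages' field falls back to the default.
    languages = record.get('languages') or [default]
    return languages[0]
```
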
inspirehep.modules.hal.utils.get_page_artid(record, separator='-')[source]

Return the page range or the article id of a record.

Parameters:
  • record (InspireRecord) – a record
  • separator (basestring) – optional page range symbol, defaults to a single dash
Returns:

the page range or the article id of the record.

Return type:

string

Examples

>>> record = {
...     'publication_info': [
...         {'artid': '054021'},
...     ],
... }
>>> get_page_artid(record)
'054021'
inspirehep.modules.hal.utils.get_page_artid_for_publication_info(publication_info, separator)[source]

Return the page range or the article id of a publication_info entry.

Parameters:
  • publication_info (dict) – a publication_info field entry of a record
  • separator (basestring) – page range symbol, for example a single dash
Returns:

the page range or the article id of the publication_info entry.

Return type:

string

Examples

>>> publication_info = {'artid': '054021'}
>>> get_page_artid_for_publication_info(publication_info, '-')
'054021'
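
The choice between an article id and a page range can be sketched as below. The field names page_start and page_end, the precedence of artid over the page range, and the empty-string fallback are all assumptions made for illustration; page_or_artid is a hypothetical name.

```python
def page_or_artid(publication_info, separator='-'):
    """Return the article id or the page range of a publication_info entry."""
    # Assumption: an article id, when present, wins over a page range.
    if 'artid' in publication_info:
        return publication_info['artid']
    # Assumption: page ranges are stored as 'page_start' and 'page_end'.
    if 'page_start' in publication_info and 'page_end' in publication_info:
        return '{}{}{}'.format(
            publication_info['page_start'],
            separator,
            publication_info['page_end'],
        )
    # Assumption: nothing usable yields an empty string.
    return ''
```
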
inspirehep.modules.hal.utils.get_peer_reviewed(record)[source]

Return 1 if a record is peer reviewed, 0 otherwise.

Parameters:record (InspireRecord) – a record.
Returns:1 if the record is peer reviewed, 0 otherwise.
Return type:int

Examples

>>> get_peer_reviewed({'refereed': True})
1
inspirehep.modules.hal.utils.get_publication_date(record)[source]

Return the date on which a record was published.

Parameters:record (InspireRecord) – a record.
Returns:the date on which the record was published.
Return type:string

Examples

>>> get_publication_date({'publication_info': [{'year': 2017}]})
'2017'
inspirehep.modules.hal.utils.is_published(record)[source]

Return whether a record is published.

We say that a record is published if it is citeable, which means that it has enough information in a publication_info, or if we know its DOI and a journal_title, which means it is in press.

Parameters:record (InspireRecord) – a record.
Returns:whether the record is published.
Return type:bool

Examples

>>> record = {
...     'dois': [
...         {'value': '10.1016/0029-5582(61)90469-2'},
...     ],
...     'publication_info': [
...         {'journal_title': 'Nucl.Phys.'},
...     ],
... }
>>> is_published(record)
True
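
The two conditions described above can be sketched as follows. What counts as "enough information" in a publication_info entry is not specified here, so the sketch assumes a journal title, a journal volume, and either a starting page or an article id; the function names are hypothetical.

```python
def has_enough_publication_info(publication_info):
    """Assumed citeability check for one publication_info entry."""
    return (
        'journal_title' in publication_info
        and 'journal_volume' in publication_info
        and ('page_start' in publication_info or 'artid' in publication_info)
    )


def looks_published(record):
    """Return True if a record is citeable or in press."""
    publication_infos = record.get('publication_info', [])
    citeable = any(
        has_enough_publication_info(entry) for entry in publication_infos
    )
    # In press: a DOI is known and some entry names a journal.
    in_press = bool(record.get('dois')) and any(
        'journal_title' in entry for entry in publication_infos
    )
    return citeable or in_press
```
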

inspirehep.modules.hal.views module

HAL views.

Module contents

HAL module.

This module converts INSPIRE literature records to the XML+TEI format supported by Hyper Articles en Ligne (HAL), a French open archive of scholarly documents.

The Jinja2 Python library is used to convert records into a HAL-supported format, after which the Python SWORD client posts these records to the HAL SWORD API.