[BETA] added the deployment instruction in the jenkins file for beta related to the creation of the action set to extend the results with the open apc transformative agreement file information — Miriam Baglioni / detail
[BETA] added the deployment instruction in the jenkins file for beta related to the creation of the action set to include the results tagged with FoS without a doi — Miriam Baglioni / detail
[DUMP] added Jenkins file for the deployment of the dumps — Miriam Baglioni / detail

#89 (Apr 16, 2024 2:23:45 PM)

Generate tables with parquet-files, instead of csv, in "dhp-stats-update/.../contexts.sh" script. — Lampros Smyrnaios / detail
added missing EOS — Antonis Lempesis / detail
fixed typo in indicator query — Antonis Lempesis / detail
fixed the result_country definition — Antonis Lempesis / detail
added new orgs in monitor — antleb / detail
- Update the code which acquires the "IMPALA_HDFS_NODE", to test the "tmp"-dir, instead of the base-dir and introduce retries, to overcome potential file-system failures. This change was suggested by "Sebastian Tymkow" and "Grzegorz Bakalarski". — Lampros Smyrnaios / detail
Extended mapping of funder from crossref (#9169, #9277) and changed the correspondence files for the Irish funders (#9635). Extended the datacite map to include the association between metadata and the EBRAINS datasource (SciLake) — Miriam Baglioni / detail
Upgrade the copying operation to Impala Cluster: — Lampros Smyrnaios / detail
Use the "HADOOP_USER_NAME" value from the "workflow-property", in "copyDataToImpalaCluster.sh", in "stats-monitor-updates". — Lampros Smyrnaios / detail
Miscellaneous updates to the copying operation to Impala Cluster: — Lampros Smyrnaios / detail
Minor updates to the copying operation to Impala Cluster: — Lampros Smyrnaios / detail

#88 (Mar 26, 2024 2:01:05 PM)

Updated deployments of orcid collection from api — Sandro La Bruzzo / detail
Change DNET_HADOOP_REPO_BRANCH default value to "beta" for BETA pipelines — Giambattista Bloisi / detail
added deployment specs for dhp-stats-monitor-update, dhp-stats-monitor-irish, dhp-stats-hist-snaps — Claudio Atzori / detail
removed deployment spec for dhp-stats-monitor-update — Claudio Atzori / detail
added deployment specs for dhp-stats-monitor-irish, dhp-stats-hist-snaps — Claudio Atzori / detail

#88 (Mar 26, 2024 2:01:05 PM)

Changes to indicators and funders definition — dpierrakos / detail
Monitor Irish Stats WF — dpierrakos / detail
Historical Snapshots Workflow — dpierrakos / detail
Update buildIrishMonitorDB.sql — dpierrakos / detail
fixed the result_country definition — Antonis Lempesis / detail
Changes to beta db names — dpierrakos / detail
Changes to indicators — dpierrakos / detail
creating result_instances even when no pids exist for the instance — Antonis Lempesis / detail
max mem of joins (hive.mapjoin.followby.gby.localtask.max.memory.usage) now 80%, up from 55%. — Antonis Lempesis / detail
Changed step16-createIndicatorsTables to use a spark oozie action instead of hive — antleb / detail
Use SparkSQL in place of Hive for executing step16-createIndicatorsTables.sql of stats update wf — Giambattista Bloisi / detail
changed orcid ids to all capital — antleb / detail
added 2 new institutions in monitor — antleb / detail
mapping of project PIDs — Michele Artini / detail
[BulkTagging] added check to verify if field is present in the pathMap — Miriam Baglioni / detail
Use SparkSQL in place of Hive for executing step16-createIndicatorsTables.sql of stats update wf — Claudio Atzori / detail
using distinct apcs per publication to avoid huge sums — Antonis Lempesis / detail
fixed the irish result subset — antleb / detail
selecting distinct peer_reviewed — antleb / detail
new plugin to collect from a dump of BASE — Michele Artini / detail
comments — Michele Artini / detail
updated sql query for filtering BASE records — Michele Artini / detail
filter by base types — Michele Artini / detail
Commit monitor-updates-wf — dpierrakos / detail
code cleanup — Antonis Lempesis / detail
code cleanup — antleb / detail
code cleanup — antleb / detail
openorgs wf updated — miconis / detail
default parameters for openorgs updated — miconis / detail
Use the ACTIVE HDFS NODE for Impala cluster, in "copyDataToImpalaCluster.sh" script. — Lampros Smyrnaios / detail
Automatically select the ACTIVE HDFS NODE for Impala cluster, in all "copyDataToImpalaCluster.sh" scripts. — Lampros Smyrnaios / detail
resolving conflicts on step16-createIndicatorsTables.sql — Claudio Atzori / detail
adjusted pom files — Claudio Atzori / detail

#86 (Feb 21, 2024 2:29:49 PM)

Fixed problem on missing author in crossref Mapping — Sandro La Bruzzo / detail
[orcid-enrichment] change the value of parameters. — Miriam Baglioni / detail
code formatting — Claudio Atzori / detail
[orcid enrichment] fixed directory cleanup before distcp — Claudio Atzori / detail
[graph cleaning] rule out datasources without an officialname — Claudio Atzori / detail
[actionsets] introduced support for the PromoteAction strategy — Claudio Atzori / detail
[actionsets] fixed join type — Claudio Atzori / detail
Dedup aliases, created when a dedup in a previous build has been merged in a new dedup, need to be marked as "deletedbyinference", since they are "merged" in the new dedup — Giambattista Bloisi / detail
[graph raw] fixed mapping of the original resource type from the Datacite format — Claudio Atzori / detail
fixed import of ORPs stored on HDFS in the internal graph format (e.g. Datacite) — Claudio Atzori / detail

#85 (Jan 26, 2024 4:19:42 PM)

added deployment procedures for biodb_aggregation, ebi_links_aggregation, pubmed_aggregation — Claudio Atzori / detail
enrichment with subworkflows — Claudio Atzori / detail
added workflow for updating the dedup pivot history database — Claudio Atzori / detail
updated deployment spec for PROD — Claudio Atzori / detail

#85 (Jan 26, 2024 4:19:42 PM)

first version of the workflow single step — Miriam Baglioni / detail
adjusting workflow definition — Miriam Baglioni / detail
removed not needed parameter — Miriam Baglioni / detail
- — Miriam Baglioni / detail
[doiboost - preprocess] remove transition to orcid preparation from sequence of steps at the beginning of the workflow — Miriam Baglioni / detail
- — Miriam Baglioni / detail
updated the transformation Baseline workflow to include mdstore rollback/commit action — Sandro La Bruzzo / detail
updated the transformation Baseline workflow to include mdstore rollback/commit action — Sandro La Bruzzo / detail
uploaded input parameters on CreateBaseline WF — Sandro La Bruzzo / detail
uploaded input parameters on CreateBaseline WF — Sandro La Bruzzo / detail
updated workflow for generation of Scholix Datasource's to use mdstore transactions — Sandro La Bruzzo / detail
added needed parameter — Miriam Baglioni / detail
- — Miriam Baglioni / detail
refactoring after compilation — Miriam Baglioni / detail
added metaresourcetype to the result hive DB view — Claudio Atzori / detail
added metaresourcetype to the result hive DB view — Claudio Atzori / detail
adjustments for country propagation — Miriam Baglioni / detail
adding the bulkTag parameter file in the folder for the oozie workflow for bulkTagging. Changes the path in the class — Miriam Baglioni / detail
changed the path to the parameter file in the class for entitytoorganization propagation — Miriam Baglioni / detail
added properties file in the folder for the workflow of orcid propagation. Changes the path in the classes implementing the propagation. Changed the path to the parameter file in the class for entitytoorganization propagation — Miriam Baglioni / detail
changed in the classes the path for the property files for the propagation of community from project — Miriam Baglioni / detail
added properties file in the folder for the workflow of project to result propagation. Changes the path in the classes implementing the propagation — Miriam Baglioni / detail
added properties file in the folder for the workflow of result to community from organization propagation. Changes the path in the classes implementing the propagation — Miriam Baglioni / detail
added properties file in the folder for the workflow of result to community from semrel propagation. Changes the path in the classes implementing the propagation — Miriam Baglioni / detail
added properties file in the folder for the workflow of result to organization from inst repo propagation. Changes the path in the classes implementing the propagation — Miriam Baglioni / detail
SparkCreateSimRels: — Giambattista Bloisi / detail
No longer use dedupId information from pivotHistory Database — Giambattista Bloisi / detail
Generate "merged" dedup id relations also for records that are filtered out by the cut parameters — Giambattista Bloisi / detail
Use dedup_wf_002 in place of dedup_wf_001 to make explicit that a different algorithm has been used to generate those kinds of ids — Giambattista Bloisi / detail
Create dedup record for "merged" pivots — Giambattista Bloisi / detail
refined mapping for the extraction of the original resource type — Claudio Atzori / detail
fix issue on FoS integration. Removing the null values from FoS — Miriam Baglioni / detail
Reusable RunSQLSparkJob for executing SQL in Spark through Oozie Spark Actions — Giambattista Bloisi / detail
[enrichment single step] refactoring to fix issue in disappeared result type — Miriam Baglioni / detail
[enrichment single step] refactoring to fix issues in disappeared result type — Miriam Baglioni / detail
[enrichment single step] remove parameter from execution — Miriam Baglioni / detail
- — Miriam Baglioni / detail
[enrichment single step] moving parameter file in correct location — Miriam Baglioni / detail
[enrichment single step] adding <end> element in wf definition — Miriam Baglioni / detail
increased shuffle partitions for publications in the country propagation workflow — Claudio Atzori / detail
[orcid enrichment] drop paths before copying the unmodified contents — Claudio Atzori / detail
[graph provision] obtain context info from the context API instead from the ISLookUp service — Claudio Atzori / detail
code formatting — Claudio Atzori / detail
[graph provision] updated param specification for the XML converter job — Claudio Atzori / detail
[collection] increased logging from the oai-pmh metadata collection process — Claudio Atzori / detail
[graph provision] retrieve all the context information by adding all=true to the requests issued to the API — Claudio Atzori / detail
added code of conduct and contributing files — Claudio Atzori / detail
minor — Claudio Atzori / detail
Update 'CONTRIBUTING.md' — Claudio Atzori / detail
max mem of joins (hive.mapjoin.followby.gby.localtask.max.memory.usage) now 80%, up from 55%. — Claudio Atzori / detail
[collection] increased logging from the oai-pmh metadata collection process — Claudio Atzori / detail
Fixed problem on missing author in crossref Mapping — Sandro La Bruzzo / detail

#84 (Dec 15, 2023 11:47:10 AM)

added step resulttocommunityfromproject to the BETA deployment — Claudio Atzori / detail
added deploy specs for stats_actionset, download_orcid_dump, horizontal orcid enrichment — Claudio Atzori / detail
switched stage names — Claudio Atzori / detail
added deployment specs for download_orcid_dump, update_actionset_statsdb, orcidEnrichment — Claudio Atzori / detail

#84 (Dec 15, 2023 11:47:10 AM)

changes to use the API instead of the IS to get the information for the communities to be used during bulktagging and context propagation — Miriam Baglioni / detail
refactoring — Miriam Baglioni / detail
[raw graph] adopting the new COAR based vocabularies for the resource typing — Claudio Atzori / detail
used the API instead of the IS for bulktagging and propagation for community through organization. Added a new propagation step for communities through projects. Still using the API and not the IS — Miriam Baglioni / detail
[raw graph] WIP: mapping original resource types — Claudio Atzori / detail
testing and fix some issues — Miriam Baglioni / detail
new spark parameter updated — Sandro La Bruzzo / detail
[raw graph] mapping original resource types — Claudio Atzori / detail
more NPE checks — Claudio Atzori / detail
[graph raw] URL Validator to accept double slashes — Claudio Atzori / detail
Add actionset creation for pubmed affiliations — Serafeim Chatzopoulos / detail
fixing issue on propagation organization. added --config to workflow definition. added oozie_app to community project — Miriam Baglioni / detail
Change the description of the workflow — Serafeim Chatzopoulos / detail
StatsDB workflow to export actionsets about OA routes, diamond, and publicly-funded — dpierrakos / detail
Renaming input param for crossref input path — Serafeim Chatzopoulos / detail
Adjust tests to new WF input params — Serafeim Chatzopoulos / detail
[graph cleaning] implemented further suggestions from https://support.openaire.eu/issues/8898 — Claudio Atzori / detail
[graph cleaning] cleanup — Claudio Atzori / detail
test for project propagation — Miriam Baglioni / detail
removed not needed test class — Miriam Baglioni / detail
- — Miriam Baglioni / detail
refactoring and test — Miriam Baglioni / detail
changing test for new implementation — Miriam Baglioni / detail
refactoring — Miriam Baglioni / detail
- — Miriam Baglioni / detail
Clear working dir in bipranker workflow — Serafeim Chatzopoulos / detail
Changes to actionsets — dpierrakos / detail
Implemented ORCID Workflow on DHP-Aggregation for retrieving ORCID DUMP and generating tables — Sandro La Bruzzo / detail
- — Miriam Baglioni / detail
Changes for tables and creation of the new indicator indi_is_result_accessible — dpierrakos / detail
[graph cleaning] applying coar based vocabularies in bulk — Claudio Atzori / detail
Update StatsAtomicActionsJob.java — dpierrakos / detail
[graph cleaning] added cleaning for result.publisher and result.instance.license — Claudio Atzori / detail
code formatting — Claudio Atzori / detail
Implemented ORCID Enrichment — Sandro La Bruzzo / detail
changed the parameter from production to baseURL. Fixed issue in tagging configuration — Miriam Baglioni / detail
refactoring — Miriam Baglioni / detail
Implemented Author Merger for ORCID that takes into account the case when name and surname are swapped — Sandro La Bruzzo / detail
added comment — Sandro La Bruzzo / detail
Changed implementation of check similarity to verify exact match of name instead of the first char — Sandro La Bruzzo / detail
added test — Sandro La Bruzzo / detail
added instanceTypeMapping original field in the mapping of — Sandro La Bruzzo / detail
added vocabulary in instanceTypeMapping for — Sandro La Bruzzo / detail
removed Orcid intersection on DOIBoost — Sandro La Bruzzo / detail
Added copy of the untouched entities of the graph — Sandro La Bruzzo / detail
code formatting — Sandro La Bruzzo / detail
Update StatsAtomicActionsJob.java — dpierrakos / detail
Removed unused function — Sandro La Bruzzo / detail
Changes to indicators — dpierrakos / detail
Add new indicator — dpierrakos / detail
New institutions added — dpierrakos / detail
using objectSubType as originalType in Crossref2Oaf, code formatting — Claudio Atzori / detail
code formatting — Claudio Atzori / detail
fixed doiboost process workflow, removed references to the ProcessORCID step — Claudio Atzori / detail
Extracted the correct original type to pass to instanceTypeMapping in Crossref Mapping — Sandro La Bruzzo / detail
code formatting — Claudio Atzori / detail
[graph grouping] added isLookupUrl to the workflow definition, passed to the grouping spark action — Claudio Atzori / detail
avoid NPEs in Vocabulary.getTermBySynonym — Claudio Atzori / detail
avoid NPEs — Claudio Atzori / detail
avoid NPEs — Claudio Atzori / detail
[bulktagging] fixed workflow parameters — Claudio Atzori / detail
[community_organization propagation] fixed workflow parameters — Claudio Atzori / detail
added serialization for the new fields imported for the Irish tender — Claudio Atzori / detail
[dedup] added isLookupUrl to the graph consistency workflow definition, required now by the entity grouping phase — Claudio Atzori / detail
[orcid enrichment] fixed workflow definition — Claudio Atzori / detail
[bulktagging] setting first step of bulktagging as the copy of the entities and relations not involved in the tagging — Miriam Baglioni / detail
[community_result_propagation] adjusting starting point of workflow — Miriam Baglioni / detail
[enrichment] passing the community API base URL — Claudio Atzori / detail
logging typo — Claudio Atzori / detail
[graph cleaning] avoid stack overflow error when navigating Oaf objects declaring an Enum — Claudio Atzori / detail
code formatting — Claudio Atzori / detail
[graph provision] added tests for the new model fields — Claudio Atzori / detail
[cleaning] allow enriched orcids to pass the cleaning, rule out non-orcid author pids — Claudio Atzori / detail
code formatting — Claudio Atzori / detail
[graph provision] added tests for new peerreviewed field — Claudio Atzori / detail