Changes

#33 (Feb 21, 2024 2:04:23 PM)

  1. Fixed problem on missing author in crossref Mapping — Sandro La Bruzzo / detail
  2. [orcid-enrichment] change the value of parameters. — Miriam Baglioni / detail
  3. code formatting — Claudio Atzori / detail
  4. [orcid enrichment] fixed directory cleanup before distcp — Claudio Atzori / detail
  5. [graph cleaning] rule out datasources without an officialname — Claudio Atzori / detail
  6. [actionsets] introduced support for the PromoteAction strategy — Claudio Atzori / detail
  7. [actionsets] fixed join type — Claudio Atzori / detail
  8. Dedup aliases, created when a dedup in a previous build has been merged in a new dedup, need to be marked as "deletedbyinference", since they are "merged" in the new dedup — Giambattista Bloisi / detail
  9. [graph raw] fixed mapping of the original resource type from the Datacite format — Claudio Atzori / detail
  10. fixed import of ORPs stored on HDFS in the internal graph format (e.g. Datacite) — Claudio Atzori / detail

#32 (Jan 29, 2024 9:04:18 AM)

  1. enrichment with subworkflows — Claudio Atzori / detail
  2. added workflow for updating the dedup pivot history database — Claudio Atzori / detail
  3. updated deployment spec for PROD — Claudio Atzori / detail

#32 (Jan 29, 2024 9:04:18 AM)

  1. first version of the workflow single step — Miriam Baglioni / detail
  2. adjusting workflow definition — Miriam Baglioni / detail
  3. removed not needed parameter — Miriam Baglioni / detail
  4. - — Miriam Baglioni / detail
  5. [doiboost - preprocess] remove transition to orcid preparation from sequence of steps at the beginning of the workflow — Miriam Baglioni / detail
  6. - — Miriam Baglioni / detail
  7. updated the transformation Baseline workflow to include mdstore rollback/commit action — Sandro La Bruzzo / detail
  8. uploaded input parameters on CreateBaseline WF — Sandro La Bruzzo / detail
  9. updated workflow for generation of Scholix Datasource's to use mdstore transactions — Sandro La Bruzzo / detail
  10. added needed parameter — Miriam Baglioni / detail
  11. - — Miriam Baglioni / detail
  12. refactoring after compilation — Miriam Baglioni / detail
  13. added metaresourcetype to the result hive DB view — Claudio Atzori / detail
  14. added metaresourcetype to the result hive DB view — Claudio Atzori / detail
  15. adjustments for country propagation — Miriam Baglioni / detail
  16. adding the bulkTag parameter file in the folder for the oozie workflow for bulkTagging. Changes the path in the class — Miriam Baglioni / detail
  17. changed the path to the parameter file in the class for entitytoorganization propagation — Miriam Baglioni / detail
  18. added properties file in the folder for the workflow of orcid propagation. Changes the path in the classes implementing the propagation. Changed the path to the parameter file in the class for entitytoorganization propagation — Miriam Baglioni / detail
  19. changed in the classes the path for the property files for the propagation of community from project — Miriam Baglioni / detail
  20. added properties file in the folder for the workflow of project to result propagation. Changes the path in the classes implementing the propagation — Miriam Baglioni / detail
  21. added properties file in the folder for the workflow of result to community from organization propagation. Changes the path in the classes implementing the propagation — Miriam Baglioni / detail
  22. added properties file in the folder for the workflow of result to community from semrel propagation. Changes the path in the classes implementing the propagation — Miriam Baglioni / detail
  23. added properties file in the folder for the workflow of result to organization from inst repo propagation. Changes the path in the classes implementing the propagation — Miriam Baglioni / detail
  24. SparkCreateSimRels: — Giambattista Bloisi / detail
  25. No longer use dedupId information from pivotHistory Database — Giambattista Bloisi / detail
  26. Generate "merged" dedup id relations also for records that are filtered out by the cut parameters — Giambattista Bloisi / detail
  27. Use dedup_wf_002 in place of dedup_wf_001 to make explicit that a different algorithm has been used to generate those kinds of ids — Giambattista Bloisi / detail
  28. Create dedup record for "merged" pivots — Giambattista Bloisi / detail
  29. refined mapping for the extraction of the original resource type — Claudio Atzori / detail
  30. fix issue on FoS integration. Removing the null values from FoS — Miriam Baglioni / detail
  31. Reusable RunSQLSparkJob for executing SQL in Spark through Oozie Spark Actions — Giambattista Bloisi / detail
  32. [enrichment single step] refactoring to fix issue in disappeared result type — Miriam Baglioni / detail
  33. [enrichment single step] refactoring to fix issues in disappeared result type — Miriam Baglioni / detail
  34. [enrichment single step] remove parameter from execution — Miriam Baglioni / detail
  35. - — Miriam Baglioni / detail
  36. [enrichment single step] moving parameter file in correct location — Miriam Baglioni / detail
  37. [enrichment single step] adding <end> element in wf definition — Miriam Baglioni / detail
  38. increased shuffle partitions for publications in the country propagation workflow — Claudio Atzori / detail
  39. [orcid enrichment] drop paths before copying the non-modified contents — Claudio Atzori / detail
  40. [graph provision] obtain context info from the context API instead of from the ISLookUp service — Claudio Atzori / detail
  41. code formatting — Claudio Atzori / detail
  42. [graph provision] updated param specification for the XML converter job — Claudio Atzori / detail
  43. [collection] increased logging from the oai-pmh metadata collection process — Claudio Atzori / detail
  44. [graph provision] retrieve all the context information by adding all=true to the requests issued to the API — Claudio Atzori / detail
  45. added code of conduct and contributing files — Claudio Atzori / detail
  46. minor — Claudio Atzori / detail
  47. Update 'CONTRIBUTING.md' — Claudio Atzori / detail
  48. max mem of joins (hive.mapjoin.followby.gby.localtask.max.memory.usage) now 80%, up from 55%. — Claudio Atzori / detail
  49. [collection] increased logging from the oai-pmh metadata collection process — Claudio Atzori / detail
  50. Fixed problem on missing author in crossref Mapping — Sandro La Bruzzo / detail

#31 (Dec 18, 2023 2:01:28 PM)

  1. updated the transformation Baseline workflow to include mdstore rollback/commit action — Sandro La Bruzzo / detail
  2. uploaded input parameters on CreateBaseline WF — Sandro La Bruzzo / detail

#30 (Dec 18, 2023 11:17:27 AM)

  1. added deployment procedures for biodb_aggregation, ebi_links_aggregation, pubmed_aggregation — Claudio Atzori / detail

#29 (Dec 15, 2023 11:26:17 AM)

  1. added step resulttocommunityfromproject — Claudio Atzori / detail
  2. renamed step resulttocommunityfromproject — Claudio Atzori / detail
  3. added step resulttocommunityfromproject to the BETA deployment — Claudio Atzori / detail
  4. added deploy specs for stats_actionset, download_orcid_dump, horizontal orcid enrichment — Claudio Atzori / detail
  5. switched stage names — Claudio Atzori / detail
  6. added deployment specs for download_orcid_dump, update_actionset_statsdb, orcidEnrichment — Claudio Atzori / detail

#29 (Dec 15, 2023 11:26:17 AM)

  1. changes to use the API instead of the IS to get the information for the communities to be used during bulktagging and context propagation — Miriam Baglioni / detail
  2. refactoring — Miriam Baglioni / detail
  3. [raw graph] adopting the new COAR based vocabularies for the resource typing — Claudio Atzori / detail
  4. used the API instead of the IS for bulktagging and propagation for community through organization. Added a new propagation step for communities through projects. Still using the API and not the IS — Miriam Baglioni / detail
  5. [raw graph] WIP: mapping original resource types — Claudio Atzori / detail
  6. testing and fix some issues — Miriam Baglioni / detail
  7. new spark parameter updated — Sandro La Bruzzo / detail
  8. [raw graph] mapping original resource types — Claudio Atzori / detail
  9. more NPE checks — Claudio Atzori / detail
  10. [graph raw] URL Validator to accept double slashes — Claudio Atzori / detail
  11. Add actionset creation for pubmed affiliations — Serafeim Chatzopoulos / detail
  12. fixing issue on propagation organization. added --config to workflow definition. added oozie_app to community project — Miriam Baglioni / detail
  13. Change the description of the workflow — Serafeim Chatzopoulos / detail
  14. StatsDB workflow to export actionsets about OA routes, diamond, and publicly-funded — dpierrakos / detail
  15. Renaming input param for crossref input path — Serafeim Chatzopoulos / detail
  16. Adjust tests to new WF input params — Serafeim Chatzopoulos / detail
  17. [graph cleaning] implemented further suggestions from https://support.openaire.eu/issues/8898 — Claudio Atzori / detail
  18. [graph cleaning] cleanup — Claudio Atzori / detail
  19. test for project propagation — Miriam Baglioni / detail
  20. removed not needed test class — Miriam Baglioni / detail
  21. - — Miriam Baglioni / detail
  22. refactoring and test — Miriam Baglioni / detail
  23. changing test for new implementation — Miriam Baglioni / detail
  24. refactoring — Miriam Baglioni / detail
  25. - — Miriam Baglioni / detail
  26. Clear working dir in bipranker workflow — Serafeim Chatzopoulos / detail
  27. Changes to actionsets — dpierrakos / detail
  28. Implemented ORCID Workflow on DHP-Aggregation for retrieving ORCID DUMP and generating tables — Sandro La Bruzzo / detail
  29. - — Miriam Baglioni / detail
  30. Project propagation via communityAPI instead of using IS via IIS — Miriam Baglioni / detail
  31. Changes for tables and creation of the new indicator indi_is_result_accessible — dpierrakos / detail
  32. [graph cleaning] applying coar based vocabularies in bulk — Claudio Atzori / detail
  33. Update StatsAtomicActionsJob.java — dpierrakos / detail
  34. [graph cleaning] added cleaning for result.publisher and result.instance.license — Claudio Atzori / detail
  35. code formatting — Claudio Atzori / detail
  36. Implemented ORCID Enrichment — Sandro La Bruzzo / detail
  37. changed the parameter from production to baseURL. Fixed issue in tagging configuration — Miriam Baglioni / detail
  38. refactoring — Miriam Baglioni / detail
  39. Implemented Author Merger for ORCID that takes into account the case when name and surname are swapped — Sandro La Bruzzo / detail
  40. added comment — Sandro La Bruzzo / detail
  41. Changed implementation of check similarity to verify exact match of name instead of the first char — Sandro La Bruzzo / detail
  42. added test — Sandro La Bruzzo / detail
  43. added instanceTypeMapping original field in the mapping of — Sandro La Bruzzo / detail
  44. added vocabulary in instanceTypeMapping for — Sandro La Bruzzo / detail
  45. removed Orcid intersection on DOIBoost — Sandro La Bruzzo / detail
  46. Added copy of the untouched entities of the graph — Sandro La Bruzzo / detail
  47. code formatting — Sandro La Bruzzo / detail
  48. Update StatsAtomicActionsJob.java — dpierrakos / detail
  49. Removed unused function — Sandro La Bruzzo / detail
  50. Changes to indicators — dpierrakos / detail
  51. Add new indicator — dpierrakos / detail
  52. New institutions added — dpierrakos / detail
  53. using objectSubType as originalType in Crossref2Oaf, code formatting — Claudio Atzori / detail
  54. code formatting — Claudio Atzori / detail
  55. fixed doiboost process workflow, removed references to the ProcessORCID step — Claudio Atzori / detail
  56. Extracted the correct original type to pass to instanceTypeMapping in Crossref Mapping — Sandro La Bruzzo / detail
  57. code formatting — Claudio Atzori / detail
  58. [graph grouping] added isLookupUrl to the workflow definition, passed to the grouping spark action — Claudio Atzori / detail
  59. avoid NPEs in Vocabulary.getTermBySynonym — Claudio Atzori / detail
  60. avoid NPEs — Claudio Atzori / detail
  61. avoid NPEs — Claudio Atzori / detail
  62. [bulktagging] fixed workflow parameters — Claudio Atzori / detail
  63. [community_organization propagation] fixed workflow parameters — Claudio Atzori / detail
  64. added serialization for the new fields imported for the Irish tender — Claudio Atzori / detail
  65. [dedup] added isLookupUrl to the graph consistency workflow definition, required now by the entity grouping phase — Claudio Atzori / detail
  66. [orcid enrichment] fixed workflow definition — Claudio Atzori / detail
  67. [bulktagging] setting first step of bulktagging as the copy of the entities and relations not involved in the tagging — Miriam Baglioni / detail
  68. [community_result_propagation] adjusting starting point of workflow — Miriam Baglioni / detail
  69. [enrichment] passing the community API base URL — Claudio Atzori / detail
  70. logging typo — Claudio Atzori / detail
  71. [graph cleaning] avoid stack overflow error when navigating Oaf objects declaring an Enum — Claudio Atzori / detail
  72. code formatting — Claudio Atzori / detail
  73. [graph provision] added tests for the new model fields — Claudio Atzori / detail
  74. [cleaning] allow enriched orcids to pass the cleaning, rule out non-orcid author pids — Claudio Atzori / detail
  75. code formatting — Claudio Atzori / detail
  76. [graph provision] added tests for new peerreviewed field — Claudio Atzori / detail