MDC Adding server to processing state table to allow the processing script to run on multiple servers #11277

Merged
merged 20 commits into from
Mar 6, 2025
Changes from 5 commits
Commits
44f568f
adding server to processing state table to allow the processing scrip…
stevenwinship Feb 19, 2025
57f6a5a
style fix
stevenwinship Feb 19, 2025
692f555
new counter processor config yaml change for robots_url and machines_url
stevenwinship Feb 20, 2025
6dbedab
update docs with counter-processor 1.06
stevenwinship Feb 20, 2025
43953e8
add flyway script
stevenwinship Feb 25, 2025
aa8beec
Update doc/sphinx-guides/source/installation/prerequisites.rst
stevenwinship Feb 25, 2025
84280b2
Merge branch 'develop' into 11276-mdc-add-server-to-processing-state-…
stevenwinship Feb 26, 2025
7a0d6dc
check for server null or empty
stevenwinship Feb 26, 2025
30dd609
Merge branch 'develop' into 11276-mdc-add-server-to-processing-state-…
stevenwinship Feb 27, 2025
d8ca033
Merge branch 'develop' into 11276-mdc-add-server-to-processing-state-…
stevenwinship Mar 3, 2025
62530ec
renamed sql
stevenwinship Mar 3, 2025
7bef70d
undo sql change
stevenwinship Mar 4, 2025
e30e4a4
Merge branch 'develop' into 11276-mdc-add-server-to-processing-state-…
stevenwinship Mar 4, 2025
261eb96
Create V6.5.0.7.sql
stevenwinship Mar 4, 2025
562a206
undo change to sql script
stevenwinship Mar 5, 2025
4fe1538
Merge branch 'develop' into 11276-mdc-add-server-to-processing-state-…
stevenwinship Mar 5, 2025
20e8e4e
Merge branch 'develop' into 11276-mdc-add-server-to-processing-state-…
stevenwinship Mar 5, 2025
6dff3d1
rename sql
stevenwinship Mar 5, 2025
9746116
rename sql
stevenwinship Mar 5, 2025
77b059e
Merge branch 'develop' into 11276-mdc-add-server-to-processing-state-…
stevenwinship Mar 6, 2025
@@ -33,8 +33,10 @@ path_types:

# Robots and machines urls are urls where the script can download a list of regular expressions to determine
# if something is a robot or machine user-agent. The text file has one regular expression per line
robots_url: https://raw.githubusercontent.com/CDLUC3/Make-Data-Count/master/user-agents/lists/robot.txt
machines_url: https://raw.githubusercontent.com/CDLUC3/Make-Data-Count/master/user-agents/lists/machine.txt
#robots_url: https://raw.githubusercontent.com/CDLUC3/Make-Data-Count/master/user-agents/lists/robot.txt
#machines_url: https://raw.githubusercontent.com/CDLUC3/Make-Data-Count/master/user-agents/lists/machine.txt
robots_url: https://raw.githubusercontent.com/IQSS/counter-processor/refs/heads/goto-gdcc/user-agents/lists/robots.txt
machines_url: https://raw.githubusercontent.com/IQSS/counter-processor/refs/heads/goto-gdcc/user-agents/lists/machine.txt

# the year and month for the report you are creating.
year_month: 2019-01
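
Review note: the config comments say each list holds one regular expression per line. A minimal Java sketch of how such a list could be applied to a user-agent string is below. Counter Processor itself is written in Python, so this is only an illustration; the class and method names are hypothetical, and using "find" (match anywhere in the string) rather than a full-string match is an assumption.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of Counter Processor.
final class UserAgentListSketch {

    // Compile one pattern per non-blank line of the downloaded list body.
    static List<Pattern> parse(String listBody) {
        return listBody.lines()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .map(Pattern::compile)
                .collect(Collectors.toList());
    }

    // Assumption: a user agent is flagged when any pattern matches anywhere in it.
    static boolean matchesAny(List<Pattern> patterns, String userAgent) {
        return patterns.stream().anyMatch(p -> p.matcher(userAgent).find());
    }
}
```

The list body would come from whatever URL is configured in robots_url or machines_url; fetching it is left out to keep the sketch self-contained.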
2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/_static/util/counter_daily.sh
@@ -1,6 +1,6 @@
#! /bin/bash

COUNTER_PROCESSOR_DIRECTORY="/usr/local/counter-processor-1.05"
COUNTER_PROCESSOR_DIRECTORY="/usr/local/counter-processor-1.06"
MDC_LOG_DIRECTORY="/usr/local/payara6/glassfish/domains/domain1/logs/mdc"

# counter_daily.sh
6 changes: 3 additions & 3 deletions doc/sphinx-guides/source/admin/make-data-count.rst
@@ -84,9 +84,9 @@ Configure Counter Processor

* Change to the directory where you installed Counter Processor.

* ``cd /usr/local/counter-processor-1.05``
* ``cd /usr/local/counter-processor-1.06``

* Download :download:`counter-processor-config.yaml <../_static/admin/counter-processor-config.yaml>` to ``/usr/local/counter-processor-1.05``.
* Download :download:`counter-processor-config.yaml <../_static/admin/counter-processor-config.yaml>` to ``/usr/local/counter-processor-1.06``.

* Edit the config file and pay particular attention to the FIXME lines.

@@ -99,7 +99,7 @@ Soon we will be setting up a cron job to run nightly but we start with a single

* Change to the directory where you installed Counter Processor.

* ``cd /usr/local/counter-processor-1.05``
* ``cd /usr/local/counter-processor-1.06``

* If you are running Counter Processor for the first time in the middle of a month, you will need to create blank log files for the previous days, e.g.:

6 changes: 4 additions & 2 deletions doc/sphinx-guides/source/developers/make-data-count.rst
@@ -49,7 +49,7 @@ Once you are done with your configuration, you can run Counter Processor like th

``su - counter``

``cd /usr/local/counter-processor-1.05``
``cd /usr/local/counter-processor-1.06``

``CONFIG_FILE=counter-processor-config.yaml python39 main.py``

@@ -82,7 +82,7 @@ Second, if you are also sending your SUSHI report to Make Data Count, you will n

``curl -H "Authorization: Bearer $JSON_WEB_TOKEN" -X DELETE https://$MDC_SERVER/reports/$REPORT_ID``

To get the ``REPORT_ID``, look at the logs generated in ``/usr/local/counter-processor-1.05/tmp/datacite_response_body.txt``
To get the ``REPORT_ID``, look at the logs generated in ``/usr/local/counter-processor-1.06/tmp/datacite_response_body.txt``

To read more about the Make Data Count api, see https://github.com/datacite/sashimi

@@ -110,9 +110,11 @@ The script will process the newest set of log files (merging files from multiple
APIs to manage the states include GET, POST, and DELETE (for testing), as shown below.

Note: ``yearMonth`` must be in the format ``yyyymm`` or ``yyyymmdd``.
Note: If running the new script on multiple servers, add the query parameter ``&server=serverName`` to the first POST call. The server name cannot be changed once set. To clear it, you must delete the state and POST a new one.

``curl -X GET http://localhost:8080/api/admin/makeDataCount/{yearMonth}/processingState``

``curl -X POST "http://localhost:8080/api/admin/makeDataCount/{yearMonth}/processingState?state=processing&server=server1"``
``curl -X POST "http://localhost:8080/api/admin/makeDataCount/{yearMonth}/processingState?state=done"``

``curl -X DELETE http://localhost:8080/api/admin/makeDataCount/{yearMonth}/processingState``
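
Review note: for callers that are not shell scripts, the POST call documented above could be built as in the following sketch, which uses the JDK's `java.net.http` API (Java 11+). The class name, method name, and the choice to URL-encode the parameter values are assumptions made for illustration; only the endpoint path and the `state` and `server` parameters come from the docs in this diff.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical client-side helper; not part of the Dataverse code base.
final class ProcessingStateRequestSketch {

    // Builds the POST request. Pass server as null to leave the parameter out,
    // e.g. on later calls once the server name has already been recorded.
    static HttpRequest build(String baseUrl, String yearMonth, String state, String server) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("/api/admin/makeDataCount/").append(yearMonth)
                .append("/processingState?state=").append(encode(state));
        if (server != null && !server.isEmpty()) {
            url.append("&server=").append(encode(server));
        }
        return HttpRequest.newBuilder(URI.create(url.toString()))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // URL-encode a query parameter value so special characters are safe.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

The returned request can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`; that step is omitted here because it needs a running server.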
14 changes: 7 additions & 7 deletions doc/sphinx-guides/source/installation/prerequisites.rst
@@ -438,9 +438,9 @@ A scripted installation using Ansible is mentioned in the :doc:`/developers/make
As root, download and install Counter Processor::

cd /usr/local
wget https://github.com/gdcc/counter-processor/archive/refs/tags/v1.05.tar.gz
tar xvfz v1.05.tar.gz
cd /usr/local/counter-processor-1.05
wget https://github.com/gdcc/counter-processor/archive/refs/tags/v1.06.tar.gz
tar xvfz v1.06.tar.gz
cd /usr/local/counter-processor-1.06

Installing GeoLite Country Database
===================================
@@ -451,7 +451,7 @@ The process required to sign up, download the database, and to configure automat

As root, change to the Counter Processor directory you just created, download the GeoLite2-Country tarball from MaxMind, untar it, and copy the geoip database into place::

<download or move the GeoLite2-Country.tar.gz to the /usr/local/counter-processor-1.05 directory>
<download or move the GeoLite2-Country.tar.gz to the /usr/local/counter-processor-1.06 directory>
tar xvfz GeoLite2-Country.tar.gz
cp GeoLite2-Country_*/GeoLite2-Country.mmdb maxmind_geoip

@@ -461,12 +461,12 @@ Creating a counter User
As root, create a "counter" user and change ownership of Counter Processor directory to this new user::

useradd counter
chown -R counter:counter /usr/local/counter-processor-1.05
chown -R counter:counter /usr/local/counter-processor-1.06

Installing Counter Processor Python Requirements
================================================

Counter Processor version 1.05 requires Python 3.7 or higher. This version of Python is available in many operating systems, and is purportedly available for RHEL7 or CentOS 7 via Red Hat Software Collections. Alternately, one may compile it from source.
Counter Processor version 1.06 requires Python 3.7 or higher. This version of Python is available in many operating systems, and is purportedly available for RHEL7 or CentOS 7 via Red Hat Software Collections. Alternately, one may compile it from source.

The following commands are intended to be run as root but we are aware that Pythonistas might prefer fancy virtualenv or similar setups. Pull requests are welcome to improve these steps!

@@ -477,7 +477,7 @@ Install Python 3.9::
Install Counter Processor Python requirements::

python3.9 -m ensurepip
cd /usr/local/counter-processor-1.05
cd /usr/local/counter-processor-1.06
pip3 install -r requirements.txt

See the :doc:`/admin/make-data-count` section of the Admin Guide for how to configure and run Counter Processor.
10 changes: 8 additions & 2 deletions src/main/java/edu/harvard/iq/dataverse/api/MakeDataCountApi.java
@@ -235,6 +235,9 @@ public Response getProcessingState(@PathParam("yearMonth") String yearMonth) {
output.add("yearMonth", mdcps.getYearMonth());
output.add("state", mdcps.getState().name());
output.add("stateChangeTimestamp", mdcps.getStateChangeTime().toString());
if ( mdcps.getServer() != null) {
output.add("server", mdcps.getServer());
}
return ok(output);
} else {
return error(Status.NOT_FOUND, "Could not find an existing process state for " + yearMonth);
@@ -243,10 +246,10 @@ public Response getProcessingState(@PathParam("yearMonth") String yearMonth) {

@POST
@Path("{yearMonth}/processingState")
public Response updateProcessingState(@PathParam("yearMonth") String yearMonth, @QueryParam("state") String state) {
public Response updateProcessingState(@PathParam("yearMonth") String yearMonth, @QueryParam("state") String state, @QueryParam("server") String server) {
MakeDataCountProcessState mdcps;
try {
mdcps = makeDataCountProcessStateService.setMakeDataCountProcessState(yearMonth, state);
mdcps = makeDataCountProcessStateService.setMakeDataCountProcessState(yearMonth, state, server);
} catch (Exception e) {
return badRequest(e.getMessage());
}
@@ -255,6 +258,9 @@ public Response updateProcessingState(@PathParam("yearMonth") String yearMonth,
output.add("yearMonth", mdcps.getYearMonth());
output.add("state", mdcps.getState().name());
output.add("stateChangeTimestamp", mdcps.getStateChangeTime().toString());
if ( mdcps.getServer() != null) {
output.add("server", mdcps.getServer());
}
return ok(output);
}

@@ -42,11 +42,14 @@ public String toString() {
private MDCProcessState state;
@Column(nullable = true)
private Timestamp stateChangeTimestamp;
@Column(nullable = true)
private String server;

public MakeDataCountProcessState() { }
public MakeDataCountProcessState (String yearMonth, String state) {
public MakeDataCountProcessState (String yearMonth, String state, String server) {
this.setYearMonth(yearMonth);
this.setState(state);
this.setServer(server);
}

public void setYearMonth(String yearMonth) throws IllegalArgumentException {
@@ -72,4 +75,10 @@ public MDCProcessState getState() {
public Timestamp getStateChangeTime() {
return stateChangeTimestamp;
}
public void setServer(String server) {
this.server = server;
}
public String getServer() {
return server;
}
}
@@ -31,10 +31,10 @@ public MakeDataCountProcessState getMakeDataCountProcessState(String yearMonth)
return mdcps;
}

public MakeDataCountProcessState setMakeDataCountProcessState(String yearMonth, String state) {
public MakeDataCountProcessState setMakeDataCountProcessState(String yearMonth, String state, String server) {
MakeDataCountProcessState mdcps = getMakeDataCountProcessState(yearMonth);
if (mdcps == null) {
mdcps = new MakeDataCountProcessState(yearMonth, state);
mdcps = new MakeDataCountProcessState(yearMonth, state, server);
} else {
mdcps.setState(state);
}
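
Review note: in the hunk above, `server` is passed only to the constructor, so an existing row keeps its original server when its state is updated. A minimal in-memory sketch of that behavior follows, with a `HashMap` standing in for the database. The class, field, and method names are hypothetical, and the sketch reflects only the code shown in this diff.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for the JPA-backed service; for illustration only.
final class ProcessStateStoreSketch {

    static final class State {
        final String yearMonth;
        String state;
        final String server; // fixed at creation, as in the diff

        State(String yearMonth, String state, String server) {
            this.yearMonth = yearMonth;
            this.state = state;
            this.server = server;
        }
    }

    private final Map<String, State> byYearMonth = new HashMap<>();

    // Create the entry on the first call; later calls change only the state.
    State set(String yearMonth, String state, String server) {
        State existing = byYearMonth.get(yearMonth);
        if (existing == null) {
            existing = new State(yearMonth, state, server);
            byYearMonth.put(yearMonth, existing);
        } else {
            existing.state = state; // the server argument is ignored on updates
        }
        return existing;
    }

    // Deleting the entry is the only way to record a different server.
    boolean delete(String yearMonth) {
        return byYearMonth.remove(yearMonth) != null;
    }
}
```

This matches the documented rule that the server name can only be cleared by deleting the state and posting a new one.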
1 change: 1 addition & 0 deletions src/main/resources/db/migration/V6.5.0.5.sql
@@ -0,0 +1 @@
ALTER TABLE makedatacountprocessstate ADD COLUMN IF NOT EXISTS server character varying(255) DEFAULT '';
@@ -181,13 +181,14 @@ public void testMakeDataCountGetMetric() throws IOException {
@Test
public void testGetUpdateDeleteProcessingState() {
String yearMonth = "2000-01";
String server = "server1";
// make sure it isn't in the DB
Response deleteState = UtilIT.makeDataCountDeleteProcessingState(yearMonth);
deleteState.then().assertThat().statusCode(anyOf(equalTo(200), equalTo(404)));

Response getState = UtilIT.makeDataCountGetProcessingState(yearMonth);
getState.then().assertThat().statusCode(NOT_FOUND.getStatusCode());
Response updateState = UtilIT.makeDataCountUpdateProcessingState(yearMonth, MakeDataCountProcessState.MDCProcessState.PROCESSING.toString());
Response updateState = UtilIT.makeDataCountUpdateProcessingState(yearMonth, MakeDataCountProcessState.MDCProcessState.PROCESSING.toString(), server);
updateState.then().assertThat().statusCode(OK.getStatusCode());
getState = UtilIT.makeDataCountGetProcessingState(yearMonth);
getState.then().assertThat().statusCode(OK.getStatusCode());
@@ -196,14 +197,17 @@ public void testGetUpdateDeleteProcessingState() {
String state1 = stateJson.getString("data.state");
assertThat(state1, Matchers.equalTo(MakeDataCountProcessState.MDCProcessState.PROCESSING.name()));
String updateTimestamp1 = stateJson.getString("data.stateChangeTimestamp");
String updateServer1 = stateJson.getString("data.server");

updateState = UtilIT.makeDataCountUpdateProcessingState(yearMonth, MakeDataCountProcessState.MDCProcessState.DONE.toString());
updateState.then().assertThat().statusCode(OK.getStatusCode());
stateJson = JsonPath.from(updateState.body().asString());
stateJson.prettyPrint();
String state2 = stateJson.getString("data.state");
String updateTimestamp2 = stateJson.getString("data.stateChangeTimestamp");
String updateServer2 = stateJson.getString("data.server");
assertThat(state2, Matchers.equalTo(MakeDataCountProcessState.MDCProcessState.DONE.name()));
assertThat(updateServer2, Matchers.equalTo(updateServer1)); // once set the only way to remove the initial server name is to delete the state

assertThat(updateTimestamp2, Matchers.is(Matchers.greaterThan(updateTimestamp1)));

5 changes: 4 additions & 1 deletion src/test/java/edu/harvard/iq/dataverse/api/UtilIT.java
@@ -3449,8 +3449,11 @@ static Response makeDataCountGetProcessingState(String yearMonth) {
return requestSpecification.get("/api/admin/makeDataCount/" + yearMonth + "/processingState");
}
static Response makeDataCountUpdateProcessingState(String yearMonth, String state) {
return makeDataCountUpdateProcessingState(yearMonth, state, null);
}
static Response makeDataCountUpdateProcessingState(String yearMonth, String state, String server) {
RequestSpecification requestSpecification = given();
return requestSpecification.post("/api/admin/makeDataCount/" + yearMonth + "/processingState?state=" + state);
return requestSpecification.post("/api/admin/makeDataCount/" + yearMonth + "/processingState?state=" + state + (server != null ? "&server=" + server : ""));
}
static Response makeDataCountDeleteProcessingState(String yearMonth) {
RequestSpecification requestSpecification = given();