kruize · dinogun · Aug 22, 2023 · Aug 21, 2023 · Jul 21, 2023 · Jul 22, 2023
diff --git a/design/KruizePromQL.md b/design/KruizePromQL.md
@@ -10,6 +10,7 @@ The following are the available Kruize APIs that you can monitor:
 - `listRecommendations` (GET): API for listing recommendations.
 - `listExperiments` (GET): API for listing experiments.
 - `updateResults` (POST): API for updating experiment results.
+- `updateRecommendations` (POST): API for updating recommendations for an experiment.
 
 ## Time taken for KruizeAPI metrics
 
@@ -21,14 +22,15 @@ To monitor the performance of these APIs, you can use the following metrics:
 
 Here are some sample metrics for the mentioned APIs which can run in Prometheus:
 
-- `kruizeAPI_seconds_count{api="createExperiment", application="Kruize", method="POST"}`: Returns the count of invocations for the `createExperiment` API.
-- `kruizeAPI_seconds_sum{api="createExperiment", application="Kruize", method="POST"}`: Returns the sum of the time taken by the `createExperiment` API.
-- `kruizeAPI_seconds_max{api="createExperiment", application="Kruize", method="POST"}`: Returns the maximum time taken by the `createExperiment` API.
+- `kruizeAPI_seconds_count{api="createExperiment", application="Kruize", method="POST", status="success"}`: Returns the count of successful invocations for the `createExperiment` API.
+- `kruizeAPI_seconds_count{api="createExperiment", application="Kruize", method="POST", status="failure"}`: Returns the count of failed invocations for the `createExperiment` API.
+- `kruizeAPI_seconds_sum{api="createExperiment", application="Kruize", method="POST", status="success"}`: Returns the total time taken by successful invocations of the `createExperiment` API.
+- `kruizeAPI_seconds_max{api="createExperiment", application="Kruize", method="POST", status="success"}`: Returns the maximum time taken by a successful invocation of the `createExperiment` API.
 
 By changing the value of the `api` and `method` label, you can gather metrics for other Kruize APIs such as `listRecommendations`, `listExperiments`, and `updateResults`.
 
 Here is a sample command to collect the metric through `curl`
-- `curl --silent -G -kH "Authorization: Bearer ${TOKEN}" --data-urlencode 'query=kruizeAPI_seconds_sum{api="listRecommendations", application="Kruize", method="GET"}' ${PROMETHEUS_URL} | jq` : 
+- `curl --silent -G -kH "Authorization: Bearer ${TOKEN}" --data-urlencode 'query=kruizeAPI_seconds_sum{api="listRecommendations", application="Kruize", method="GET", status="success"}' ${PROMETHEUS_URL} | jq` : 
 Returns the sum of the time taken by `listRecommendations` API.
 
 Sample Output:
@@ -64,15 +66,17 @@ Sample Output:
 
 The following are the available Kruize DB methods that you can monitor:
 
-- `addRecommendationToDB`: Method for adding a recommendation to the database.
-- `addResultsToDB`: Method for adding experiment results to the database.
-- `loadAllRecommendations`: Method for loading all recommendations from the database.
-- `loadAllExperiments`: Method for loading all experiments from the database.
 - `addExperimentToDB`: Method for adding an experiment to the database.
-- `loadResultsByExperimentName`: Method for loading experiment results by experiment name.
+- `addResultToDB`: Method for adding experiment results to the database.
+- `addBulkResultsToDBAndFetchFailedResults`: Method for adding bulk experiment results to the database and fetching the failed results.
+- `addRecommendationToDB`: Method for adding a recommendation to the database.
 - `loadExperimentByName`: Method for loading an experiment by name.
-- `loadAllResults`: Method for loading all experiment results from the database.
+- `loadResultsByExperimentName`: Method for loading experiment results by experiment name.
 - `loadRecommendationsByExperimentName`: Method for loading recommendations by experiment name.
+- `loadRecommendationsByExperimentNameAndDate`: Method for loading recommendations by experiment name and date.
+- `addPerformanceProfileToDB`: Method for adding a performance profile to the database.
+- `loadPerformanceProfileByName`: Method for loading a specific performance profile by name.
+- `loadAllPerformanceProfiles`: Method for loading all performance profiles.
 
 ## Time taken for KruizeDB metrics
 
@@ -84,14 +88,15 @@ To monitor the performance of these methods, you can use the following metrics:
 
 Here are some sample metrics for the mentioned DB methods which can run in Prometheus:
 
-- `kruizeDB_seconds_count{application="Kruize", method="loadAllExperiments"}`: Number of times the `loadAllExperiments` method was called.
-- `kruizeDB_seconds_sum{application="Kruize", method="loadAllExperiments"}`: Total time taken by the `loadAllExperiments` method.
-- `kruizeDB_seconds_max{application="Kruize", method="loadAllExperiments"}`: Maximum time taken by the `loadAllExperiments` method.
+- `kruizeDB_seconds_count{application="Kruize", method="addExperimentToDB", status="success"}`: Number of successful invocations of the `addExperimentToDB` method.
+- `kruizeDB_seconds_count{application="Kruize", method="addExperimentToDB", status="failure"}`: Number of failed invocations of the `addExperimentToDB` method.
+- `kruizeDB_seconds_sum{application="Kruize", method="addExperimentToDB", status="success"}`: Total time taken by successful invocations of the `addExperimentToDB` method.
+- `kruizeDB_seconds_max{application="Kruize", method="addExperimentToDB", status="success"}`: Maximum time taken by a successful invocation of the `addExperimentToDB` method.
 
 By changing the value of the `method` label, you can gather metrics for other KruizeDB metrics.
 
 Here is a sample command to collect the metric through `curl`
-- `curl --silent -G -kH "Authorization: Bearer ${TOKEN}" --data-urlencode 'query=kruizeDB_seconds_sum{method="loadRecommendationsByExperimentName"}' ${PROMETHEUS_URL} | jq` :
+- `curl --silent -G -kH "Authorization: Bearer ${TOKEN}" --data-urlencode 'query=kruizeDB_seconds_sum{application="Kruize", method="loadRecommendationsByExperimentName", status="success"}' ${PROMETHEUS_URL} | jq` :
   Returns the sum of the time taken by `loadRecommendationsByExperimentName` method.
 
 Sample Output:

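The `status` label documented in `KruizePromQL.md` makes it possible to derive a mean success latency and a failure ratio from the `_sum` and `_count` series. The following is a minimal sketch, assuming the values have already been fetched from Prometheus. The class `KruizeMetricMath`, its methods, and the sample numbers are hypothetical and are not part of Kruize.

```java
import java.util.Locale;

public class KruizeMetricMath {

    // Builds a selector in the form shown in KruizePromQL.md, including the status label.
    public static String apiSelector(String metric, String api, String method, String status) {
        return String.format(Locale.ROOT,
                "%s{api=\"%s\", application=\"Kruize\", method=\"%s\", status=\"%s\"}",
                metric, api, method, status);
    }

    // Mean latency in seconds is _sum divided by _count; returns 0 when nothing was recorded.
    public static double meanSeconds(double sumSeconds, double count) {
        return count == 0 ? 0.0 : sumSeconds / count;
    }

    // Fraction of invocations that failed, from the two status="..." counts.
    public static double failureRatio(double successCount, double failureCount) {
        double total = successCount + failureCount;
        return total == 0 ? 0.0 : failureCount / total;
    }

    public static void main(String[] args) {
        System.out.println(apiSelector("kruizeAPI_seconds_count", "createExperiment", "POST", "failure"));
        // Hypothetical sample values, not real Kruize measurements.
        System.out.println(meanSeconds(12.5, 50));
        System.out.println(failureRatio(45, 5));
    }
}
```

Because the success and failure series are now separate, a query that omits `status` returns one series per status value rather than a single combined series.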
diff --git a/src/main/java/com/autotune/analyzer/services/CreateExperiment.java b/src/main/java/com/autotune/analyzer/services/CreateExperiment.java
@@ -80,6 +80,7 @@ public void init(ServletConfig config) throws ServletException {
 
     @Override
     protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
+        String statusValue = "failure";
         Timer.Sample timerCreateExp = Timer.start(MetricsConfig.meterRegistry());
         Map<String, KruizeObject> mKruizeExperimentMap = new ConcurrentHashMap<String, KruizeObject>();;
         try {
@@ -112,8 +113,10 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response)
                         ExperimentDAO experimentDAO = new ExperimentDAOImpl();
                         addedToDB = new ExperimentDBService().addExperimentToDB(validAPIObj);
                     }
-                    if (addedToDB.isSuccess())
+                    if (addedToDB.isSuccess()) {
                         sendSuccessResponse(response, "Experiment registered successfully with Kruize.");
+                        statusValue = "success";
+                    }
                     else {
                         sendErrorResponse(response, null, HttpServletResponse.SC_BAD_REQUEST, addedToDB.getMessage());
                     }
@@ -127,7 +130,10 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response)
             LOGGER.error("Unknown exception caught: " + e.getMessage());
             sendErrorResponse(response, e, HttpServletResponse.SC_INTERNAL_SERVER_ERROR, "Internal Server Error: " + e.getMessage());
         } finally {
-            if (null != timerCreateExp) timerCreateExp.stop(MetricsConfig.timerCreateExp);
+            if (null != timerCreateExp) {
+                MetricsConfig.timerCreateExp = MetricsConfig.timerBCreateExp.tag("status", statusValue).register(MetricsConfig.meterRegistry());
+                timerCreateExp.stop(MetricsConfig.timerCreateExp);
+            }
         }
     }
 

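The pattern used in `doPost` above (default `statusValue` to `"failure"`, flip it to `"success"` only on the happy path, and record the timing in `finally`) can be sketched without Micrometer as follows. This is an illustration under stated assumptions, not the Kruize implementation: `StatusTaggedTiming`, `timed`, and `count` are hypothetical names, and two plain maps stand in for the meter registry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.BooleanSupplier;

public class StatusTaggedTiming {

    // Stand-in for a meter registry: invocation counts keyed by "api|status".
    static final Map<String, LongAdder> COUNTS = new ConcurrentHashMap<>();
    // Total elapsed nanoseconds keyed by "api|status".
    static final Map<String, LongAdder> TOTAL_NANOS = new ConcurrentHashMap<>();

    // Runs the handler and records the elapsed time under status="success" only if
    // the handler returns true; a false result or an exception is recorded as "failure".
    public static void timed(String api, BooleanSupplier handler) {
        String statusValue = "failure"; // pessimistic default, as in the diff
        long start = System.nanoTime();
        try {
            if (handler.getAsBoolean()) {
                statusValue = "success";
            }
        } finally {
            // The key is a local variable, so concurrent calls cannot change it
            // between the lookup and the recording.
            String key = api + "|" + statusValue;
            COUNTS.computeIfAbsent(key, k -> new LongAdder()).increment();
            TOTAL_NANOS.computeIfAbsent(key, k -> new LongAdder()).add(System.nanoTime() - start);
        }
    }

    public static long count(String api, String status) {
        LongAdder adder = COUNTS.get(api + "|" + status);
        return adder == null ? 0 : adder.sum();
    }

    public static void main(String[] args) {
        timed("createExperiment", () -> true);
        timed("createExperiment", () -> false);
        try {
            timed("createExperiment", () -> { throw new IllegalStateException("boom"); });
        } catch (IllegalStateException expected) {
            // The exception still propagates; the timing was recorded as a failure.
        }
        System.out.println(count("createExperiment", "success"));
        System.out.println(count("createExperiment", "failure"));
    }
}
```

In the servlet code above, the registered timer is assigned to the static `MetricsConfig.timerCreateExp` before `stop` is called. Holding the registered timer in a local variable instead, as the sketch does with `key`, is one way to keep two concurrent requests from overwriting each other's status; whether that matters depends on how `MetricsConfig` is used elsewhere.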
diff --git a/src/main/java/com/autotune/analyzer/services/ListExperiments.java b/src/main/java/com/autotune/analyzer/services/ListExperiments.java
@@ -102,6 +102,7 @@ public void init(ServletConfig config) throws ServletException {
 
     @Override
     protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
+        String statusValue = "failure";
         Timer.Sample timerListExp = Timer.start(MetricsConfig.meterRegistry());
         response.setStatus(HttpServletResponse.SC_OK);
         response.setContentType(JSON_CONTENT_TYPE);
@@ -120,63 +121,69 @@ protected void doGet(HttpServletRequest request, HttpServletResponse response) t
                 invalidParams.add(param);
             }
         }
-        if (invalidParams.isEmpty()) {
-            // Set default values if absent
-            if (results == null || results.isEmpty())
-                results = "false";
-            if (recommendations == null || recommendations.isEmpty())
-                recommendations = "false";
-            if (latest == null || latest.isEmpty())
-                latest = "true";
-            // Validate query parameter values
-            if (isValidBooleanValue(results) && isValidBooleanValue(recommendations) && isValidBooleanValue(latest)) {
-                try {
-                    // Fetch experiments data from the DB and check if the requested experiment exists
-                    loadExperimentsFromDatabase(mKruizeExperimentMap, experimentName);
-                    // Check if experiment exists
-                    if (experimentName != null && !mKruizeExperimentMap.containsKey(experimentName)) {
-                        error = true;
-                        sendErrorResponse(
-                                response,
-                                new Exception(AnalyzerErrorConstants.APIErrors.ListRecommendationsAPI.INVALID_EXPERIMENT_NAME_EXCPTN),
-                                HttpServletResponse.SC_BAD_REQUEST,
-                                String.format(AnalyzerErrorConstants.APIErrors.ListRecommendationsAPI.INVALID_EXPERIMENT_NAME_MSG, experimentName)
-                        );
-                    }
-                    if (!error) {
-                        // create Gson Object
-                        Gson gsonObj = createGsonObject();
-
-                        // Modify the JSON response here based on query params.
-                        gsonStr = buildResponseBasedOnQuery(mKruizeExperimentMap, gsonObj, results, recommendations, latest, experimentName);
-                        if (gsonStr.isEmpty()) {
-                            gsonStr = generateDefaultResponse();
+        try {
+            if (invalidParams.isEmpty()) {
+                // Set default values if absent
+                if (results == null || results.isEmpty())
+                    results = "false";
+                if (recommendations == null || recommendations.isEmpty())
+                    recommendations = "false";
+                if (latest == null || latest.isEmpty())
+                    latest = "true";
+                // Validate query parameter values
+                if (isValidBooleanValue(results) && isValidBooleanValue(recommendations) && isValidBooleanValue(latest)) {
+                    try {
+                        // Fetch experiments data from the DB and check if the requested experiment exists
+                        loadExperimentsFromDatabase(mKruizeExperimentMap, experimentName);
+                        // Check if experiment exists
+                        if (experimentName != null && !mKruizeExperimentMap.containsKey(experimentName)) {
+                            error = true;
+                            sendErrorResponse(
+                                    response,
+                                    new Exception(AnalyzerErrorConstants.APIErrors.ListRecommendationsAPI.INVALID_EXPERIMENT_NAME_EXCPTN),
+                                    HttpServletResponse.SC_BAD_REQUEST,
+                                    String.format(AnalyzerErrorConstants.APIErrors.ListRecommendationsAPI.INVALID_EXPERIMENT_NAME_MSG, experimentName)
+                            );
+                        }
+                        if (!error) {
+                            // create Gson Object
+                            Gson gsonObj = createGsonObject();
+
+                            // Modify the JSON response here based on query params.
+                            gsonStr = buildResponseBasedOnQuery(mKruizeExperimentMap, gsonObj, results, recommendations, latest, experimentName);
+                            if (gsonStr.isEmpty()) {
+                                gsonStr = generateDefaultResponse();
+                            }
+                            response.getWriter().println(gsonStr);
+                            response.getWriter().close();
+                            statusValue = "success";
                         }
-                        response.getWriter().println(gsonStr);
-                        response.getWriter().close();
+                    } catch (Exception e) {
+                        LOGGER.error("Exception: " + e.getMessage());
+                        e.printStackTrace();
+                        sendErrorResponse(response, e, HttpServletResponse.SC_INTERNAL_SERVER_ERROR, e.getMessage());
                     }
-                } catch (Exception e) {
-                    LOGGER.error("Exception: " + e.getMessage());
-                    e.printStackTrace();
-                    sendErrorResponse(response, e, HttpServletResponse.SC_INTERNAL_SERVER_ERROR, e.getMessage());
-                } finally {
-                    if (null != timerListExp) timerListExp.stop(MetricsConfig.timerListExp);
+                } else {
+                    sendErrorResponse(
+                            response,
+                            new Exception(AnalyzerErrorConstants.APIErrors.ListRecommendationsAPI.INVALID_QUERY_PARAM_VALUE),
+                            HttpServletResponse.SC_BAD_REQUEST,
+                            String.format(AnalyzerErrorConstants.APIErrors.ListRecommendationsAPI.INVALID_QUERY_PARAM_VALUE)
+                    );
                 }
             } else {
                 sendErrorResponse(
                         response,
-                        new Exception(AnalyzerErrorConstants.APIErrors.ListRecommendationsAPI.INVALID_QUERY_PARAM_VALUE),
+                        new Exception(AnalyzerErrorConstants.APIErrors.ListRecommendationsAPI.INVALID_QUERY_PARAM),
                         HttpServletResponse.SC_BAD_REQUEST,
-                        String.format(AnalyzerErrorConstants.APIErrors.ListRecommendationsAPI.INVALID_QUERY_PARAM_VALUE)
+                        String.format(AnalyzerErrorConstants.APIErrors.ListRecommendationsAPI.INVALID_QUERY_PARAM, invalidParams)
                 );
             }
-        } else {
-            sendErrorResponse(
-                    response,
-                    new Exception(AnalyzerErrorConstants.APIErrors.ListRecommendationsAPI.INVALID_QUERY_PARAM),
-                    HttpServletResponse.SC_BAD_REQUEST,
-                    String.format(AnalyzerErrorConstants.APIErrors.ListRecommendationsAPI.INVALID_QUERY_PARAM, invalidParams)
-            );
+        } finally {
+            if (null != timerListExp) {
+                MetricsConfig.timerListExp = MetricsConfig.timerBListExp.tag("status", statusValue).register(MetricsConfig.meterRegistry());
+                timerListExp.stop(MetricsConfig.timerListExp);
+            }
         }
     }
 

diff --git a/src/main/java/com/autotune/analyzer/services/ListRecommendations.java b/src/main/java/com/autotune/analyzer/services/ListRecommendations.java
@@ -73,6 +73,7 @@ public void init(ServletConfig config) throws ServletException {
 
     @Override
     protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
+        String statusValue = "failure";
         Timer.Sample timerListRec = Timer.start(MetricsConfig.meterRegistry());
         response.setContentType(JSON_CONTENT_TYPE);
         response.setCharacterEncoding(CHARACTER_ENCODING);
@@ -197,6 +198,7 @@ protected void doGet(HttpServletRequest request, HttpServletResponse response) t
                                                                             checkForTimestamp,
                                                                             monitoringEndTimestamp);
                         recommendationList.add(listRecommendationsAPIObject);
+                        statusValue = "success";
                     } catch (Exception e) {
                         LOGGER.error("Not able to generate recommendation for expName : {} due to {}", ko.getExperimentName(), e.getMessage());
                     }
@@ -233,7 +235,10 @@ public boolean shouldSkipClass(Class<?> clazz) {
             e.printStackTrace();
             sendErrorResponse(response, e, HttpServletResponse.SC_INTERNAL_SERVER_ERROR, e.getMessage());
         } finally {
-            if (null != timerListRec) timerListRec.stop(MetricsConfig.timerListRec);
+            if (null != timerListRec) {
+                MetricsConfig.timerListRec = MetricsConfig.timerBListRec.tag("status", statusValue).register(MetricsConfig.meterRegistry());
+                timerListRec.stop(MetricsConfig.timerListRec);
+            }
         }
     }
 

diff --git a/src/main/java/com/autotune/analyzer/services/UpdateRecommendations.java b/src/main/java/com/autotune/analyzer/services/UpdateRecommendations.java
@@ -37,6 +37,7 @@
 import io.micrometer.core.instrument.Timer;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
+import io.micrometer.core.instrument.Timer;
 
 import javax.servlet.ServletConfig;
 import javax.servlet.ServletException;
@@ -79,6 +80,8 @@ public void init(ServletConfig config) throws ServletException {
      */
     @Override
     protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
+        String statusValue = "failure";
+        Timer.Sample timerBUpdateRecommendations = Timer.start(MetricsConfig.meterRegistry());
         try {
             // Get the values from the request parameters
             String experiment_name = request.getParameter(KruizeConstants.JSONKeys.EXPERIMENT_NAME);
@@ -163,12 +166,11 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response)
                     KruizeObject kruizeObject = mainKruizeExperimentMAP.get(experiment_name);
                     new ExperimentInitiator().generateAndAddRecommendations(kruizeObject, experimentResultDataList, interval_start_time, interval_end_time);
                     ValidationOutputData validationOutputData = new ExperimentDBService().addRecommendationToDB(mainKruizeExperimentMAP, experimentResultDataList);
-                    if (validationOutputData.isSuccess())
+                    if (validationOutputData.isSuccess()) {
                         sendSuccessResponse(response, kruizeObject, interval_end_time);
-                    else {
-
+                        statusValue = "success";
+                    } else {
                         sendErrorResponse(response, null, HttpServletResponse.SC_INTERNAL_SERVER_ERROR, validationOutputData.getMessage());
-
                     }
                 } catch (Exception e) {
                     e.printStackTrace();
@@ -188,6 +190,11 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response)
             LOGGER.error("Exception: " + e.getMessage());
             e.printStackTrace();
             sendErrorResponse(response, e, HttpServletResponse.SC_INTERNAL_SERVER_ERROR, e.getMessage());
+        } finally {
+            if (null != timerBUpdateRecommendations) {
+                MetricsConfig.timerUpdateRecomendations = MetricsConfig.timerBUpdateRecommendations.tag("status", statusValue).register(MetricsConfig.meterRegistry());
+                timerBUpdateRecommendations.stop(MetricsConfig.timerUpdateRecomendations);
+            }
         }
     }