Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement support for Message Templates #47

Open
rbjorklin opened this issue Sep 14, 2022 · 4 comments
Open

Implement support for Message Templates #47

rbjorklin opened this issue Sep 14, 2022 · 4 comments

Comments

@rbjorklin
Copy link

There was a request for structured logging in #24 which looks like it was implemented in a separate package as syslog.

It would be great if there was support for structured logging in a json format and it looks like Message Templates is akin to a spec that unifies human readable log messages with structured json messages which looks promising.

@dbuenzli dbuenzli changed the title [Feature Request] Implement support for Message Templates Implement support for Message Templates Sep 14, 2022
@dbuenzli
Copy link
Owner

Logs depends heavily on Format but you can add tags to your messages (which you reporter can then log to whatever you wish).

So what is your idea exactly ? What should logs do ?

@rbjorklin
Copy link
Author

Let's see if I can express myself in a somewhat understandable way 😃

I would like to see a reporter that can produce a structured message in json format based of a message template as shown in the message template overview.

Hypothetical scenario:

let user = "Santa Claus"
let location = "Rovaniemi"

let () = Logs_template.info (fun m -> m "{user} is currently located at {place_in_world}" user location)

(* The above would produce something along these lines
{
  "user": "Santa Claus",
  "place_in_world": "Rovaniemi",
  "timestamp": "2022-12-24T00:00:00Z",
  "template": "{user} is currently located at {place_in_world}"
}
*)

Having the template be part of the log message as the "type" would allow for easy filtering of a specific group of messages.

The same template format could also be used to produce human readable log messages. I could see a use-case where one might want to control the output while doing local dev work by doing --log-output=plain and then use --log-output=json for production.

Let me know If this sounds like it is out of scope for what you envision logs to be.

@dbuenzli
Copy link
Owner

Let me know If this sounds like it is out of scope for what you envision logs to be.

The problem is not about scope, the problem is rather how do you fit that in logs as it exists…

@cemerick
Copy link

cemerick commented May 1, 2023

@dbuenzli asked me to register a comment.

In recently needing to implement JSON-formatted structured logging, I came across this issue, and learned of the "message template" approach. It's something that I think I'll be moving towards over time, as it eliminates a bunch of duplication that's currently necessary to emit useful JSON logs with logs.

As it stands now, I'm using a single tag set for each logging event, which carries a (string * Yojson.Safe.t) list for easy integration into a larger composite Yojson.Safe.t value that is eventually written to stderr. Typical usage looks like this:

Log.info (fun m ->
    m
      ~tags:(meta_set [log_meta login; Domains.log_meta endpoint; RegionName.log_meta regionName])
      "Created branch transfer for person %d to branch %a, domain %a" person.id
      RegionName.pp regionName Domains.pp_endpoint endpoint);

Each of the log_meta functions produces a (string * Yojson.Safe.t) list corresponding to the respective modules' types; meta_set is a function that takes a list of such lists, flattens them, and produces a single Logs.Tag.set. Note though that there's a bunch of duplication between the tag set and the parameters to the format string; this is easily the most irritating aspect of logs usage now that we're emitting structured logs.

The actual JSON logging is pretty straightforward:

  (** A JSON format for use with [structured]. Intended for ingestion by JSON-compatible
  log sinks, not to be easily read by humans. *)
  let json_format out (`Assoc _ as v) =
    Format.(
      pp_print_string out @@ Yojson.Safe.to_string v;
      pp_print_newline out ())

  let structured format out =
    let report src level ~over k msgf =
      let timestamp = timestamp () in
      msgf
      @@ fun ?header ?tags fmt ->
      Format.(
        kasprintf
          (fun msg ->
            let meta = Option.(tags >>= Logs.Tag.find meta_d |> get_or ~default:[]) in
            let a =
              `Assoc
                ([ "time_t", `String timestamp ]
                @ (match level with
                  | Logs.App -> []
                  | lvl -> [ "level", `String (Logs.level_to_string (Some lvl)) ])
                @ (match Logs.Src.name src with
                  | "application" -> []
                  | src -> [ "src", `String src ])
                @ (match header with None -> [] | Some s -> [ "header", `String s ])
                @ meta
                @ [ "msg", `String msg ])
            in
            format out a;
            over ();
            k ())
          fmt)
    in
    { Logs.report }

Some of the parameters and such are badly/confusingly named at the moment; specifically, the first argument to structured is a function of Format.formatter -> [< ``Assoc of (string * Yojson.Safe.t) list ] -> unit that handles the actual redirection to a formatter. json_format is the one suitable for ~production environments that pull structured log messages into e.g. events and analytics services; there's an alternative that doesn't actually emit JSON, but JSON-esque strings that are pleasant enough for humans to read in textual ~development environments.


To answer this though:

The problem is not about scope, the problem is rather how do you fit that in logs as it exists…

I'd say that it's not clear that it should fit into logs, at least not as its currently conceived:

  • As useful as Format strings are when used for one-off string generation/interpolation (i.e. exactly the use case when outputting to stderr or equivalents), it's the wrong focus when the job to be done is shipping off a bag of heterogeneous data; positional interpolation is just painful in that context.
  • Related to that, the duplication at logging callsites of arranging tagged data and providing it to be formatted into the "message" string is bad on every level. It has generally doubled the linecount occupied by logging statements, which is ofc tedious to write, and pretty painful to read (I need to look up e.g. pattern-based folds in vscode so as to cope with the now decreased signal/noise ratio per screenful of code)
  • theoretically, those discrete bits of data that end up as slots in the emitted JSON message objects could be tags, but tags are just too much work to declare individually like that; a typical log message we emit now contains ~15 discrete kv pairs, and that number isn't going to shrink. I understand and appreciate the impulse behind making tags as typesafe as possible right up to the point when their printed representations hit the wire/channel, but sending off a bag of indistinguishable Yojson kv pairs is fine IMO. The worst case scenario is that an interpolation is missed in a message's template (something that is likely to be caught quite quickly in development, and if not, coped with pretty easily in ~prod, since the templated message shouldn't be conveying any important information on its own).

To be more constructive, it could work if a set of parallel logging functions grounded out in something like this:

type msgf_structured = ?⁠header:string -> ?⁠tags:Tag.set -> string -> (string * Yojson.Safe.t) list -> unit

The mandatory string argument would be a simple textual interpolation format, with values interpolated from the list of kvs.

At the end of the day, something like this and the existing Format-based logging facilities could definitely coexist given a functorized logs library so that the types associated with different msgf implementations could be adapted as necessary to common infrastructure.

But, all of this sort of presumes either breaking changes, or a drastic adjustment of logs' approach, so it's not clear to me that it should all sit under a single roof. (Or, perhaps that it's definitely not for me to suggest such changes in earnest.)

Hope that's helpful. :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants