November 17, 2018

The usefulness of data

How do you represent something? This is very often the question we have to answer when building our programs. We have a view of the world which we wish to encode into a system. How is this represented?

What if

What if we chose to use only data to represent our problem? Here are some examples:

Hiccup

What if we used data to represent HTML? This is hiccup, where each HTML element is represented as a vector: a leading keyword names the tag, an optional map holds the attributes, and everything after that is a child element.

    [:div
      [:h3 "This will be rendered as HTML"]
      [:ul#ul-id.pretty
        [:li {:style "font-weight: bold;"} "One of the most popular Clojure libraries out there"]
        [:li "You can find the library at " [:a {:href "https://github.com/weavejester/hiccup"} "GitHub"]]
        [:li "Since it is pure data and our functions always produce data we can use the full syntax of the language to generate our HTML"]
        (for [x (range 10)]
          [:li (str x (str/upper-case "times"))])
        [:li "I can also call any other functions I have defined myself"]
        [:li "Or put in something completely ad hoc"]
        (map (fn [x] [:li (str "Something random " (rand-int x))]) (range 100))]]
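
Because the markup is plain vectors, an ordinary function can build it. The following is a minimal sketch that needs no rendering library; `bullet-list` is a hypothetical helper made up for illustration, not part of hiccup.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: turns a collection of strings into a hiccup-style
;; [:ul ...] vector with one [:li ...] child per string.
(defn bullet-list
  [items]
  (into [:ul]
        (map (fn [item] [:li (str/capitalize item)]) items)))

(bullet-list ["one" "two"])
;; => [:ul [:li "One"] [:li "Two"]]
```

The result is still just data, so it can be nested inside a larger structure or inspected in a test before anything is rendered.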

HoneySQL

What if we used data to represent SQL?

    {:select [:*]
     :from [:my_table]
     :where [:or [:and [:> :id 100]
                       [:< :id 200]]
                 [:= :id 0]]
     :limit 10}
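
Since the query is a map, it can be changed with the usual map functions before it is ever turned into SQL. The sketch below is one way this might look; `and-where` and `page` are hypothetical helpers, and the formatting step that produces the SQL string is left to HoneySQL.

```clojure
(def base-query
  {:select [:*]
   :from   [:my_table]
   :where  [:= :id 0]
   :limit  10})

;; Hypothetical helper: narrows a query by wrapping its existing :where
;; clause and the new condition in [:and ...].
(defn and-where
  [query condition]
  (update query :where (fn [existing] [:and existing condition])))

;; Hypothetical helper: selects page n (zero-based) using the query's :limit.
(defn page
  [query n]
  (assoc query :offset (* n (:limit query))))

(-> base-query
    (and-where [:< :id 200])
    (page 2))
;; => {:select [:*]
;;     :from   [:my_table]
;;     :where  [:and [:= :id 0] [:< :id 200]]
;;     :limit  10
;;     :offset 20}
```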

Ring

What if we used data to represent HTTP?

    ;; request
    {:cookies {}
     :remote-addr "127.0.0.1"
     :params {:page "1"}
     :headers {"accept" "application/edn"
               "accept-encoding" "gzip, deflate, br"
               "accept-language" "sv-SE,sv;q=0.9,en-US;q=0.8,en;q=0.7,nb;q=0.6,da;q=0.5"
               "connection" "keep-alive"
               "host" "example.com"
               "referer" "http://example.com/"
               "user-agent" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"}
     :server-port 80
     :content-length 0
     :form-params {}
     :websocket? false
     :query-params {"page" "1", :page 1}
     :content-type nil
     :character-encoding "utf8"
     :uri "/"
     :server-name "example.com"
     :query-string "page=1"
     :body nil
     :scheme :http
     :request-method :get}

    ;; response
    {:status 200
     :headers {"Content-Type" "text/plain"}
     :body "Hello world"}
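
With requests and responses as maps, a handler is simply a function from one map to the other, and it can be called directly without starting a server. A minimal sketch follows; the `wrap-server-header` middleware and its "example" header value are made up for illustration.

```clojure
;; A handler takes a request map and returns a response map.
(defn handler
  [request]
  {:status  200
   :headers {"Content-Type" "text/plain"}
   :body    (str "Hello from " (:uri request))})

;; Hypothetical middleware: wraps a handler and adds a header to
;; whatever response map the wrapped handler returns.
(defn wrap-server-header
  [handler]
  (fn [request]
    (assoc-in (handler request) [:headers "Server"] "example")))

((wrap-server-header handler) {:request-method :get :uri "/"})
;; => {:status  200
;;     :headers {"Content-Type" "text/plain", "Server" "example"}
;;     :body    "Hello from /"}
```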

Yada + bidi

What if we used data to represent routes in HTTP and describe what they’re supposed to do?

    ["/store/"
     [ ; Vector containing our store's routes (bidi)
      ["index.html"
       {:summary "A list of the products we sell"
        :methods
        {:get
         {:response (file "index.html")
          :produces "text/html"}}}]
      ["cart"
       {:summary "Our visitor's shopping cart"
        :methods
        {:get
         {:response (fn [ctx] …)
          :produces #{"text/html" "application/json"}}
         :post
         {:response (fn [ctx] …)
          :produces #{"text/html" "application/json"}}}}]
      …
     ]]

The advantage of data

Using data to represent the world has a number of advantages. The first is that data only represents itself. Data is easily represented as text and gives a full view of what the code is trying to do. In addition, data can be turned into any other representation you wish, and it is easy to send over the wire.

The second advantage is that all data can be manipulated by the same functions. There is no need to learn an entirely new syntax in order to manipulate a SQL query, or to programmatically read what has come in from a web browser in an HTTP request. The same standard library works across all those different contexts and data representations.

The third advantage is that data can be extended. Want to build something atop a library that someone wrote that uses data, but add your own little twist? Write the data in the format the original author defined and add your own modifications for your own code. Done.
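
As a rough sketch of that kind of extension, the map below borrows the shape of the cart resource from the routing example and adds one extra key. The `:my-app/required-role` key and the `allowed?` function are hypothetical, and whether a given library tolerates extra keys in its data is an assumption to check against that library.

```clojure
(def cart-resource
  {:summary "Our visitor's shopping cart"
   :methods {:get {:produces #{"text/html" "application/json"}}}})

;; Our own addition: a namespaced key, so it cannot clash with the
;; keys the original data format already uses.
(def protected-cart
  (assoc cart-resource :my-app/required-role :customer))

;; Hypothetical check that reads only our own key and ignores the rest.
(defn allowed?
  [resource user]
  (let [role (:my-app/required-role resource)]
    (or (nil? role)
        (contains? (:roles user) role))))

(allowed? protected-cart {:roles #{:customer}}) ;; => true
(allowed? protected-cart {:roles #{}})          ;; => false
(allowed? cart-resource  {:roles #{}})          ;; => true
```

The original keys are untouched, so code that only knows about `:summary` and `:methods` keeps working on the extended map.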

Tags: data