Skip to content

Multiformat display function#3

Closed
flying-sheep wants to merge 7 commits intomasterfrom
multi-image
Closed

Multiformat display function#3
flying-sheep wants to merge 7 commits intomasterfrom
multi-image

Conversation

@flying-sheep
Copy link
Member

Hi, please voice your concerns about code organization/DRYness now, before i pull this in.

R/display.R Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure about this function. Can you pull this (and related changes like get_display_fun out into a separate PR so we can discuss them apart from the more mundane changes to formatting etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, basically this commit is the meat of it, the rest is format changes.

how can i “un-push” the other commit? or is it enough for you to be able to see that commit on its own?

@takluyver
Copy link
Member

Instead of defer, let's have a function called something like prepare_binary_data.

Code that wants to provide multiple formats in a single output should just call display directly for now, I think.

@flying-sheep
Copy link
Member Author

rather prepare_image_data, no? SVG isn’t binary, but also has to be sent here.

and i don’t get what you mean with your second line: display_images basically is the functionality needed to display multiple outputs. do you mean i should move display_images to IRkernel?

@takluyver
Copy link
Member

Width and height metadata don't have any effect on svg or pdf, in any case - they're only used for png and jpeg.

@takluyver
Copy link
Member

I don't think 'images' is a useful subcategory of outputs - as far as possible, anything we do should work for all output formats.

@flying-sheep
Copy link
Member Author

hmm, currently those are convenience functions to faciliate sending multiple plots.

if i get you right, you propose to

  1. remove width and height from everything but display_png (and move the metadata-modifying code to it)
  2. add all other formats to the possible formats (in the moment just HTML, which would then be another non-binary format like SVG)

@takluyver
Copy link
Member

I'd be inclined to do without the convenience code for now, and write the multiple-plot-format code where it's going to be used. Then we can later work out what convenience methods, if any, we want to add to this.

Besides the function names (display_png etc.), I'm inclined to keep this package using only explicit mimetypes, rather than abbreviations, so it shouldn't need maps of short names -> mimetypes or vice versa.

Unfortunately I don't have enough time to think through the API design carefully at the moment, because I'm getting ready to go to Pycon on Sunday.

@flying-sheep
Copy link
Member Author

hmm, maybe this works for you.

@takluyver
Copy link
Member

Can we put this on hold for the moment? I'll try to work out exactly what I want in a couple of weeks when I'm back from PyCon. In the meantime, I'd suggest making a PR for the functionality you want in IRkernel, without using any of this new convenience code, so we can see what that looks like.

flying-sheep added a commit to IRkernel/IRkernel that referenced this pull request Apr 2, 2015
@flying-sheep
Copy link
Member Author

no convenience code anymore. well, except from not using mimetypes but format names.

i’d really really like to see multiple formats in the kernel, and i think it would be much duplicated work to implement it in a different way.

of course it can wait as i can use local copies to get what i want.

@takluyver
Copy link
Member

What's wrong with using mimetypes? You can construct lists with mimetype keys: list('image/png'='foobar'). I think it's fine for the kernel to be specifying mimetypes rather than short names.

@flying-sheep
Copy link
Member Author

btw.: this is the change to the kernel code necessary to get multiple plot outputs based on this PR.

(the other changes are updates to namespace and the notebooks with the new content)

@flying-sheep
Copy link
Member Author

yeah, you’re right, but it’s getting too late for today.

are you leaving tomorrow? if so, have a nice trip!

PS: would that be the only thing needing change for you to accept both PRs?

@takluyver
Copy link
Member

I'm leaving on Sunday, but there's still some stuff I need to prepare ;-). Thanks!

I think I'd like to see an IRkernel PR that achieves the same without any change to IRdisplay for the time being. I don't think that's terribly difficult, the code should look something like this:

data = list()
for (mimetype in names(image_formats)) {
    format <- image_formats[[mimetype]]
    tf <- tempfile(fileext = format$extension)
    do.call(format$device, c(list(filename = tf), get_plot_options()))
    replayPlot(plotobj)
    dev.off()
    if (format$binary){
        data[mimetype] = base64encode(file)
    } else {
        data[mimetype] = readChar(file, ...)
    }
}
display(data)

That makes me think that the most important thing to add to IRdisplay is better knowledge about which mimetypes are binary, and perhaps a convenience wrapper to read a file as text/binary according to mimetype. But I haven't actually tried implementing that, so the implementation may turn up more things.

@jankatins
Copy link
Contributor

I think one of the problems here is mixing display_png with display a plot:

  • display_png() should get a png representation of an object (aka calling _repr_png_(obj)) and send that over the wire as a display_data message with a data dict which includes a text/plain and image/png(?) mimetype, but no more. It's a user callable function. (see IPython/core/display.py, but that looks like it also will move somewhere else during the next days as it is python specific)
  • display generated plots (or more general display any returned value) is sending the result of a user plotting some data over the wire, also via a display_data message. It computes the mimetypes it is asked to include (e.g. you can add and remove formatters (= repr_whatever functions) and all that return non-NULL are included). It's an internal thingy and not callable by the user and a thing of the kernel itself. For the python kernel, this is defined in ipython_kernel/displayhook.py and the matplotlib inline backend.
  • display() is also a user callable function which per default computes all mimetypes which are active (see "display a plot") and includes all in the data dict of a display_data message.
  • [There should be something about metadata? Haven't yet used that...]

If we want to model this more or less like the the python kernel, then we need:

  • a repr package, which defines some interface to get a representation for an object for multiple mimetypes. This package is generic and has no ipython/jupyter dependencies. The kernel depends on this package.
  • in the kernel, a list of active "formatter" (list of "active/not + mimetype name + repr_whatever functions") and something to manipulate this list (generic de-/activate them and some convenience function, which only de-/activates the image ones for plots)
  • in the kernel, a hook, which tries to format all returned objects via all active formatters and sends all non-NULL results as a executive_result.
  • in the kernel, some default repr functions for common objects (aka at least _repr_png_, _repr_svg_ and _repr_pdf_ for plots and a default "plain text" for all).
  • a display package, which implements display_png/...() (uses the formater in the kernel, which handles this mimetype, no matter if active or not) and display() (which uses all active formatters in the kernel list) + a internal publish_display_data(list-of-mimetype+data). This package depends on the kernel. For the user, this is optional.
  • in the kernel, a hook, which tries to format all created plots via all active formatters and sends all non-NULL results as a display_data (aka display(plot)).

[Updated with informations on how plots are send (-> with a "display_data" message and not via a "execution_result")]

@flying-sheep
Copy link
Member Author

i agree with most of this!

but i’d rather like to go the following route:

  1. get everything useful from this PR in.
  2. have usable multi-representation plots in IRkernel
  3. switch to the repr lib later. (repr_*.recordedplot for plots, instead of doing it in the kernel)

  • in the kernel, some default repr functions for common objects (aka at least repr_png, repr_svg and repr_pdf for plots and a default "plain text" for all).

why not use the ones from the repr lib?


i think display_multi with the keys changed to mimetypes has its place besides display1, as both have basically the same interface:

display1(mime, content, meta) vs display_multi(mime1 = list(content1, meta1), ...)

i could change the latter to

display_multi(list(mime1, content1, meta1), ...) or

display_multi(mimetypes = c(mime1, ...), content = list(content1, ...), metadata = list(meta1, ...))

@jankatins
Copy link
Contributor

Re where to put the Code: I think the "get plotting output out to the frontend" code should not be in IRdisplay at all but RKernel. If you start using the repr package now, it should actually work (minus content send, as the default formatter are not implemented yet), as the interface is there. Having the kernel side, implementing display is IMO much cleaner, as it all becomes something like

display <- function(obj, mimetype=NULL, raw=FALSE){
  if (raw and mimetype == NULL) stop("raw output without a specified mimetype doesn't work..."
 _send_display_data(obj, mimetype, raw)
}
display_png <- function(obj, raw=FALSE){
 _send_display_data(obj, "image/png", raw)
}
_send_display_data <- function(obj, mimetype=NULL, raw=FALSE){
  display_data = c() # I'm pretty sure there is a more R way of constructing a list in a loop :-)
  if (raw != NULL){
    if (mimetype == NULL) stop("raw output without a specified mimetype doesn't work..."
    display_data = c(c(mimetype, obj))
  }else if (mimetype is NULL){
      # this could be a function in the kernel, as this is more or less what happens 
      # for all returned objects in the execution_result message
      # get the list of all active formatters
      # for formatter in active:
      #    display_data = c(display_data, c(formatter$mimetype, formatter$repr(obj)))
  } else {
      # get the formatter, no matter if active or not 
      display_data = c(c(formatter$mimetype, formatter$repr(obj)))
 }
  # base64 encode all binary data, aka images mimetypes 
  # -> should probably be a function in the kernel, as it is needed there as well for 
  #     the execution_result message
  # do whatever it takes to send the data in a `display_data` message
}

[in python, there is also the possibility to send multiple objects at once, e.g. display(obj1, obj2) -> not sure how this can be implemented in R? with ...? it should end up as a loop in _send_display_data]

In the end, display* is only used by users, not by the kernel at all. The real "display machinery" (current wording for the IRdisplay package) is then in the repr and IRkernel packages.

Re "where to put repr_png": having some default implementation directly in the the repr package sounds actually better than having the implementations in the kernel, as these are incentives for other package (knitr?) to use it and gain such things for free :-)

@flying-sheep
Copy link
Member Author

so, i think this is it for now.

the better/repr version can come after this, but now we have

  1. display_png and friends working like they used to. we can change them, but i didn’t for now
  2. display_multi working with raw data and mimetypes, analogous to display1 (maybe i could rename it to display_alternatives?)
  3. the server presenting alternatives for plots, which is very useful if you use nbconvert

@flying-sheep
Copy link
Member Author

so what do you say? pull this, then flesh out and switch to repr?

@takluyver
Copy link
Member

Please hold off until I have time to dig in and review this, which probably won't be until after Pycon. Sorry for the delay.

flying-sheep added a commit to IRkernel/IRkernel that referenced this pull request Apr 9, 2015
flying-sheep added a commit to IRkernel/IRkernel that referenced this pull request Apr 9, 2015
@flying-sheep
Copy link
Member Author

so this is it: i basically maintained the API of the convenience functions here and added the mimetype-agnostic display_alternatives function to mirror display1.

IRkernel/IRkernel#82 depends on display_alternatives.

R/display.R Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be an alternative branch for display (if it returns a display_data reply): if the data is a list, use this path else the current display?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i’m going all python zen here: expicit is better than implicit, yet practicality beats purity.

this would hint at unifying them, but there’s one difference:

display_alternatives is more related to display1, because it knows about mimetypes. display is simply “do everything manually”, display1 and display_alternatives are “display things using mimetypes”, and display_* are convenience functions to save people from remembering all those mimetypes.

@takluyver
Copy link
Member

It looks like the only real difference between display() and display_alternatives() is that the latter takes the R equivalent of **kwargs, whereas you need to call display() with a list. But list() is fewer characters than _alternatives ;-). In any case, I think that publishing the raw mapping of mimetypes to data is not something that you should be doing very often, so I don't think we need a convenience function for it. We already found that we can easily live without it in IRkernel.

So with my decision-making hat on: let's drop display_alternatives from this PR and focus on reviewing the rest of what it provides.

We should, however, introduce a convenience function that takes an object and tries to call the various repr_foo methods on it, like handle_value in IRkernel does, so that the user can explicitly display an object without having to put it as an expression at the end of the cell. In IPython, that is IPython.display.display(). That can be a separate PR, though.

@jankatins jankatins mentioned this pull request Apr 26, 2015
10 tasks
@flying-sheep
Copy link
Member Author

abandoning in favor of #7 and #8

@flying-sheep flying-sheep deleted the multi-image branch April 26, 2015 13:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants