Re: [svg-developers] canonical expressions -- part 3: more efficient ways of packing text into rectangles

  • ddailey
    Message 1 of 3 , Nov 7, 2010
      The concept of "how best" to write something got me wondering about the following.

      Using an alphabet or a syllabary (like most of the languages of the world excepting Chinese, Japanese, Mayan, and a few hundred others), how much "space" does it take to convey our meaning?*

      Here's the question: if we relax the rules of English orthography just a bit, so that instead of writing only from left to right, we write from left to right, or downward, or inward (by allowing glyphs to be "inside" one another), can we write legibly in less space?

      http://granite.sru.edu/~ddailey/svg/canonical.svg

      This link shows a way of packing letters into a space under the relaxed rules of right-or-down-or-inside.

      Suppose we define legibility by some empirically determined threshold on the minimum size of a glyph, and we let physics constrain the two-dimensional placement of our glyphs, subject to rotation, scaling, and translation, so that they pack tightly. Can we then find ways of expressing English (or another language using some alphabet) in less space than by writing simply unidirectionally?

      Vincent Hardy's work with cameras at http://svg-wow.org/blog/2010/08/14/camera/ reinforces this idea that writing need not be unidirectional. And from many languages we know that it need not be. By what grammar might we guide the maximization of our expressiveness per unit of space and time?

      cheers
      David

      * As a kid I subscribed to Quino Lingo and observed that English took up far less room, on average, than French, Spanish, Italian, Russian, German, Latin or Basque. I studied Navajo as a big kid and can testify that it takes up *room* to write it, though not so extravagantly as most languages. Chinese seems to be quite effective.

      ----- Original Message -----
      From: ddailey
      To: svg-developers@yahoogroups.com
      Sent: Sunday, November 07, 2010 11:31 PM
      Subject: Re: [svg-developers] canonical expressions -- part 2: A challenge: accessbility and symbols of the public domain (wikipedia)



      Challenge: come up with "better" symbols for signifying "public domain" or "copyright free."

      Begin here http://granite.sru.edu/~ddailey/svg/pd3.svg . Look at the source code and then see what you think. I'll get back to that example toward the end of this message.

      As a bit of searching in Google Images*, Wikipedia and Wikimedia Commons will reveal, there are several symbols meant to depict the concepts of "copyright free" or "public domain" or "copyleft." Not only do these concepts have slightly different nuances of meaning, but the symbols have a many-to-many relationship with the concepts. Furthermore, the symbols have differing levels of accessibility, depending on for whom we mean to be "making," "allowing," or "enabling" them to be accessible. And many of the symbols, while looking alike, have very different underlying file structures.

      Following a recent visit to openclipart.org** I was rather prepared for what Jeff Schiller calls "cruft" when I saw the earlier image at http://commons.wikimedia.org/wiki/File:Publicdomain.svg
      as described there. I did the following [Hand edited to remove sodipodi and inkscape references, remove unused gradients, remove unused styles; replaced duplicated paths by <use> elements; simplified complex cubic beziers as simple arc subcommands; used integer arithmetic. Replaced complex arcs by circles. New file is 18 lines of code (<1kb) -- old file was 144 lines (>5kb). New file should have better semantics for re-editing basic objects.]
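One of the clean-up steps listed above, removing unused gradients, can be checked mechanically. What follows is only a minimal sketch in Python, not the procedure actually used on the Wikimedia file: the function name `unused_gradient_ids` and the sample markup are made up for illustration, and the check covers only `url(#id)` and `href` references.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
GRADIENT_TAGS = {f"{{{SVG_NS}}}linearGradient", f"{{{SVG_NS}}}radialGradient"}


def unused_gradient_ids(svg_text):
    """Return the ids of gradients that no attribute in the file refers to.

    Hypothetical helper: a single pass only. A gradient that is referenced
    solely by another unused gradient still counts as used here.
    """
    root = ET.fromstring(svg_text)
    defined = {
        el.get("id")
        for el in root.iter()
        if el.tag in GRADIENT_TAGS and el.get("id")
    }
    referenced = set()
    for el in root.iter():
        for name, value in el.attrib.items():
            # fill="url(#id)", stroke="url(#id)", or the same inside style="..."
            referenced.update(re.findall(r"url\(#([^)]+)\)", value))
            # href / xlink:href, e.g. one gradient inheriting stops from another
            if name in ("href", f"{{{XLINK_NS}}}href") and value.startswith("#"):
                referenced.add(value[1:])
    return sorted(defined - referenced)


# Made-up sample markup, not taken from any real file.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg">
  <defs>
    <linearGradient id="used"><stop offset="0" stop-color="#000"/></linearGradient>
    <linearGradient id="leftover"><stop offset="0" stop-color="#fff"/></linearGradient>
  </defs>
  <circle cx="50" cy="50" r="40" fill="url(#used)"/>
</svg>"""

print(unused_gradient_ids(SAMPLE))  # ['leftover']
```

A report like this only says what could be deleted; whether deleting it also erases something the artist cared about is the separate question raised further down.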

      Well, 18 lines and 895 bytes definitely seems better than 5 kilobytes of code. But is the new code more accessible? Well, I think it is, but how can I tell for sure? How does one come up with the "best" expression for such a simple figure?

      Look inside the two figures and you'll see several questions that pose themselves:
      is it better to use <use>?
      does striking all the sodipodi stuff erase some of the artist's brushstrokes?***
      are two paths with one rotating the other better than one that has twice as many coordinates listed?
      doesn't it make more sense to let color be inherited from the group rather than individually defined for each path?
      what about the optical illusion of the letters pd for public domain? Should that be made semantic in our markup?

      I confess it took me a while of fiddling to replace all those cubic beziers from Inkscape by the "canonical" arc equivalents. But I figure that the seven coordinates (or so) that I used, instead of the sixty or so in the original path, ought to make the content more accessible to future analysts if anyone ever wants to modify it!
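To make the arc-versus-bezier trade-off concrete: an SVG `A` path command takes seven numbers (rx, ry, x-axis-rotation, large-arc-flag, sweep-flag, x, y), while every cubic `C` segment takes six, and an editor may emit several segments per curve. The Python sketch below draws the same half circle both ways and counts the numbers in each path. The function names, the radius, and the choice of four cubic segments are assumptions for illustration; none of this is taken from the files discussed here.

```python
import math
import re


def arc_path(cx, cy, r, start_deg, end_deg):
    """Path data for a circular arc using one SVG 'A' command."""
    a0, a1 = math.radians(start_deg), math.radians(end_deg)
    x0, y0 = cx + r * math.cos(a0), cy + r * math.sin(a0)
    x1, y1 = cx + r * math.cos(a1), cy + r * math.sin(a1)
    large = 1 if abs(end_deg - start_deg) > 180 else 0
    sweep = 1 if end_deg > start_deg else 0  # increasing angle = sweep-flag 1
    return f"M{x0:.2f} {y0:.2f} A{r:.2f} {r:.2f} 0 {large} {sweep} {x1:.2f} {y1:.2f}"


def cubic_path(cx, cy, r, start_deg, end_deg, segments):
    """The same arc approximated by `segments` cubic bezier 'C' commands."""
    a0, a1 = math.radians(start_deg), math.radians(end_deg)
    step = (a1 - a0) / segments
    k = 4 / 3 * math.tan(step / 4) * r  # standard control-point distance

    def point(a):
        return cx + r * math.cos(a), cy + r * math.sin(a)

    x, y = point(a0)
    parts = [f"M{x:.2f} {y:.2f}"]
    for i in range(segments):
        a, b = a0 + i * step, a0 + (i + 1) * step
        (xa, ya), (xb, yb) = point(a), point(b)
        # Control points lie along the tangents at each end of the segment.
        c1x, c1y = xa - k * math.sin(a), ya + k * math.cos(a)
        c2x, c2y = xb + k * math.sin(b), yb - k * math.cos(b)
        parts.append(f"C{c1x:.2f} {c1y:.2f} {c2x:.2f} {c2y:.2f} {xb:.2f} {yb:.2f}")
    return " ".join(parts)


def count_numbers(path_data):
    """How many numeric values appear in a path data string."""
    return len(re.findall(r"-?\d+(?:\.\d+)?", path_data))


arc = arc_path(50, 50, 40, 0, 180)
cubic = cubic_path(50, 50, 40, 0, 180, segments=4)
print(arc)                   # M90.00 50.00 A40.00 40.00 0 0 1 10.00 50.00
print(count_numbers(arc))    # 9  (2 for the move, 7 for the arc)
print(count_numbers(cubic))  # 26 (2 for the move, 6 per cubic segment)
```

The counts say nothing about rendering quality; they only illustrate why an arc can be easier to read and re-edit by hand than the bezier chain an editor emits.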

      Next question (and maybe more important):

      Take a look at http://granite.sru.edu/~ddailey/svg/pd3.svg

      The image on the left is one of the current images served by wikimedia as the symbol for "copyright free" [2]. Perhaps it is based on [3]. Perhaps the metadata associated with the file should show its ancestry?

      The file history shows some well-deserved attempt to rid the file of unneeded complexity and cruft.
      The current image (in its eleventh incarnation on Wikipedia) consists of four circles and three rectangles. One of the rectangles looks like it has been added merely to carve out a portion of a circle to make it look like a "c." This doesn't seem very accessible.

      So in my quick attempt, I put a "c" in the middle of the circle. I defined the circle as not two circles but one. I defined the rectangle as not two but one, and I defined the "C" as not two circles and a rectangle, but as a "c". I also made a stab at adding <title> and <desc> tags to describe the "why and what" of the file.

      So here is the challenge: can we come up with a better version of the symbol than what is there right now?

      Can we come up with one we will all agree is better?

      What I don't like about my attempt is that the "C" is dependent upon system fonts??? Changing from sans-serif to arial makes a huge difference in some browsers!

      Should the circle be one circle or two?

      Should the circle really be carved by a clipPath consisting of two arcs or should it be a circle with a line (rect) that crosses it? I chose a crossing line but was not convinced this was right.

      I stretched the "C" horizontally to make it appear to conform to the circle outside. Circles would have conformed better!

      What is the canonical <title> and <desc> information to go with the proper file?

      What is the proper way to refer to this discussion thread should we ever agree on my desire to replace the four circles and three rects?

      cheers
      David

      *I discovered to my great dismay that Ditto.com, as of about 6 months ago, no longer exists[1]. Their lawsuit paved the way for Google images which followed almost to the day the initial ruling in favor of Ditto.com.

      **After spending a bit of time reminding myself of why I (wearing various hats that I do) don't use more images from http://www.openclipart.org/ , I wrote a bit of script to help me find the relevant <path> objects (amidst gradients and filters that are never used) that actually draw the interesting shapes (assigning mouseover events that parse and modify the "style" of the active object).

      *** I remember in the 1980's trying to help students (and a wife) recover their corrupted word-processed files and realizing that the archival copy actually contained not only the document but the "edit history" of the document including backspaces, deletes, copies and pastes!

      [1] http://srufaculty.sru.edu/david.dailey/copyright/legalthumb.htm
      [2] http://en.wikipedia.org/wiki/File:PD-icon.svg

      [3] http://en.wikipedia.org/wiki/File:Red_copyright.svg

      [Non-text portions of this message have been removed]





    • Jacob Beard
      Message 2 of 3 , Nov 8, 2010
        On 10-11-08 06:07 AM, ddailey wrote:
        >
        > The concept of "how best" to write something got me wondering about
        > the following.
        >
        > Using an alphabet or a syllabary (like most of the languages of the
        > world excepting Chinese, Japanese, Mayan, and a few hundred others)
        > how much "space" does it take to convey our meaning.*
        >
        > Here's the question: if we relax the rules of English orthography just
        > a bit, so that instead of writing only from left to right, we write from
        > left to right, or downward, or inward (by allowing glyphs to be
        > "inside" one another) , can we write legibly in less space?
        >
        > http://granite.sru.edu/~ddailey/svg/canonical.svg
        >
        > This link shows a way of packing letters into a space under the
        > relaxed rules of right-or-down-or-inside.
        >
        > If we confine legibility by some empirically defined threshold on the
        > minimum size of a glyph, then if we allow physics to constrain the two
        > dimensional placement of our glyphs, subject to rotation scaling and
        > translation, to pack tightly, then can we find ways of expressing
        > English (or another language using some alphabet) using less space
        > than by writing simply unidirectionally?
        >
        That's pretty interesting. I think there's a bit of work from the field
        of visual modelling that might be useful and relevant here. For one
        thing, it would probably be useful to formally define a notion of
        "insideness" in the language definition of your graphical language (the
        "abstract syntax of the concrete syntax", in use modelling parlance). In
        your language definition, you would probably say that each glyph may
        have some region in which other glyphs may be placed, and that doing so
        has some relation to the abstract syntax, or the structure of the
        language. You may also define some constraints in terms of layout in the
        language definition.

        You can see some similar work has been done here:
        http://msdl.cs.mcgill.ca/people/hv/teaching/MSBDesign/notes.ClassificationFrameworkVisualLanguages.pdf

        The author discusses classes of visual language, including
        "geometry-based languages", in which the meaning of the relationships
        between elements is primarily expressed in terms of their geometric
        properties (e.g. position relative to one another in the coordinate
        system, but I suppose this could be generalized). This includes a formal
        notion of "insideness" (see page 10, definition of ULinclude).
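As a rough illustration of what a formal "insideness" relation might look like once each glyph declares a region in which other glyphs may be placed, here is a minimal Python sketch. It is not the ULinclude definition from the paper; it swaps in plain axis-aligned bounding boxes, and the names `Box`, `Glyph`, and `is_inside`, along with all of the coordinates, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle: top-left corner plus width and height."""
    x: float
    y: float
    width: float
    height: float

    def contains(self, other: "Box") -> bool:
        """True when `other` lies entirely within this box (edges may touch)."""
        return (
            self.x <= other.x
            and self.y <= other.y
            and other.x + other.width <= self.x + self.width
            and other.y + other.height <= self.y + self.height
        )


@dataclass(frozen=True)
class Glyph:
    char: str
    bounds: Box                     # space the glyph itself occupies
    interior: Optional[Box] = None  # region where other glyphs may be placed


def is_inside(inner: Glyph, outer: Glyph) -> bool:
    """`inner` is inside `outer` if it fits in outer's declared interior."""
    return outer.interior is not None and outer.interior.contains(inner.bounds)


# Made-up coordinates: an "O" with a usable counter, and a small "n" in it.
big_o = Glyph("O", Box(0, 0, 100, 100), interior=Box(25, 25, 50, 50))
small_n = Glyph("n", Box(30, 30, 40, 40))

print(is_inside(small_n, big_o))  # True
print(is_inside(big_o, small_n))  # False
```

A real definition would presumably work on the glyph outlines rather than on rectangles, but even this crude version gives a layout algorithm something to test against.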

        Once you have formally defined the notion of "insideness" for your
        language, and have defined the special "inside" region for each element
        of your language (each glyph), and the special relationships between
        each element in the language, then it may be possible to begin applying
        existing layout algorithms, again perhaps from the domain of visual
        modelling languages. I'm thinking Harel's paper "An algorithm for blob
        hierarchy layout", while not completely relevant, might be an
        interesting place to start as a model for examining the efficacy of a
        particular layout for a particular graphical language:
        http://portal.acm.org/citation.cfm?id=345240

        There would be many ways of analyzing such algorithms, including
        usability/readability, and space-efficiency.
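For the space-efficiency side, one simple measure is the area of the rectangle that bounds all the glyphs in a layout, checked against a minimum glyph size as a stand-in for legibility. The Python sketch below compares a left-to-right layout with a nested one. The box coordinates, the `(x, y, width, height)` tuple convention, and the thresholds are all assumptions chosen for illustration.

```python
def bounding_area(boxes):
    """Area of the smallest axis-aligned rectangle enclosing every box.

    Each box is an (x, y, width, height) tuple.
    """
    left = min(x for x, y, w, h in boxes)
    top = min(y for x, y, w, h in boxes)
    right = max(x + w for x, y, w, h in boxes)
    bottom = max(y + h for x, y, w, h in boxes)
    return (right - left) * (bottom - top)


def meets_legibility(boxes, min_size):
    """True when no glyph box is smaller than `min_size` in either dimension."""
    return all(min(w, h) >= min_size for x, y, w, h in boxes)


# Two glyphs written left to right, both at full size.
linear = [(0, 0, 100, 100), (100, 0, 100, 100)]
# The second glyph scaled down and placed inside the first.
nested = [(0, 0, 100, 100), (30, 30, 40, 40)]

print(bounding_area(linear))         # 20000
print(bounding_area(nested))         # 10000
print(meets_legibility(nested, 30))  # True
print(meets_legibility(nested, 50))  # False
```

The nested layout halves the area here, but only while the legibility threshold tolerates the smaller glyph. That tension is the trade-off any such analysis would have to quantify.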

        Jake


      • Dailey, David P.
        Message 3 of 3 , Nov 8, 2010
          Wow! Very interesting papers Jake! I'm very interested in visual languages and am pleased to know that there has been some work done in this area -- and it is strong-looking work as well!

          One other vaguely related thing (but not so formally presented) was this from SVG Open 2007:

          "SVG Pictograms with Natural Language Based and Semantic Information" by Kazunari ITO et al., available at
          http://www.svgopen.org/2007/papers/SVGOpen2007abstract/index.html -- they were sort of interested in making a language (that would be cross-culturally readable) out of juxtapositions and animations of familiar icons (there are remarkably many in international usage already).

          Thanks for the references Jake, I'm intrigued.

          David

          From: svg-developers@yahoogroups.com [mailto:svg-developers@yahoogroups.com] On Behalf Of Jacob Beard
          Sent: Monday, November 08, 2010 4:20 AM
          To: svg-developers@yahoogroups.com
          Subject: Re: [svg-developers] canonical expressions -- part 3: more efficient ways of packing text into rectangles


