#1 Edited by aslambilal (11 posts)

Hello. I recently (maybe 3 or 4 days ago?) started learning Ruby and Rails and couldn't find a project to learn them with. Then my friend texted me asking me to find him a free comic book organizer, and I thought, "Hey, why don't I do that with Rails?"


So here I am. I found your API and I'm starting to learn. I'm new to XML and APIs, so I'll probably be bugging you guys a lot. Anyway, here's my first question.

# Start Ruby Code
require 'rubygems'
require 'hpricot'
require 'open-uri'


url = "http://api.comicvine.com/characters/?api_key=(APIKEY)&gender=M&sort=birth_date&format=xml"
doc = Hpricot( open( url ) )

def extractcdata( input )
  input.to_s[9..(input.to_s.size-4)]
end

x = 0

(doc/:character).each do |c|
  x = x+1
  ['name'].each do |el|
    if c.at(el).innerHTML != "<![CDATA[]]>"
      puts "#{el}: #{extractcdata(c.at(el).innerHTML)}"
    end
  end
end

puts "X = " + x.to_s
# End Ruby Code

Output

name: The Legion of Super-Heroes
name: The Menace Of Dream Girl!
name: The War Between Supergirl and The Supermen Emergency Squad!
name: Hercules in the 20th Century!
name: The War Between Supergirl and The Supermen Emergency Squad!
name: The War Between Supergirl and The Supermen Emergency Squad!
name: Lana Lang and The Legion of Super-Heroes!
name: Escape of the Fatal Five!
name: The War Between Supergirl and The Supermen Emergency Squad!
name: The Confession of Superboy!
name: The Legion of Super-Heroes
name: Last Fight For A Legionnaire
name: The Boy With Ultra-Powers!
name: Tomorrow's Heroes Today!
name: One of Us Is a Traitor!
name: Hercules in the 20th Century!
name: The Legion of Super-Heroes
name: Barney
name: TNTNT
name: TNTNT
name: The Subject Is... Taboo!
name: Jumpstart!
name: Jumpstart!
name: Jumpstart!
name: Jumpstart!
name: Jumpstart!
name: TNTNT
name: TNTNT
name: Jumpstart!
name: Mangled!
name: The Subject Is... Taboo!
name: Teknight
name: Jumpstart!
name: Hey! Hugh! Get Off'a McCloud!
name: Rex Mundi
name: The Name Of The Game Is Fear!
name: The Name Of The Game Is Fear!
name: Rafferty
name: Jigglypuff
name: Jumpstart!
name: Homeboy!
name: Homeboy!
name: Homeboy!
name: Party Time!
name: Party Time!
name: Party Time!
name: The Face
name: Auro
name: Gale Allen
name: Futura
name: Hunt Bowman
name: Lyssa
name: Mars God Of War
name: Captain Science
name: Mysta Of The Moon
name: Bron II
name: Rocketman
name: The Bug
name: Flash
name: Hawkman
name: Shanghai Lil
name: Prof. Cosmos
name: Bulletman & Bulletgirl
name: Nightveil
name: Queen Reina
name: Proxima
name: Ms. Victory
name: Dragonfly
name: Haunted Horseman
name: The Good, The Bad, And The Paranormal
name: The Good, The Bad, And The Paranormal
name: Jason
name: Keyop
name: Mark
name: G Force to the Rescue!
name: Tiny
name: Director Anderson
name: General Andreas Tomak
name: Zoltar
name: Clayton Moore
name: Kay Aldridge
name: Nyoka
name: Betty Lou
name: Natsuki
name: General Sherman
name: Dragon In France During The Reign Of Louis The 16th
name: Louis The 16th
name: The Man In The Iron Mask
name: "The Fall Of Avalon"
name: Rance
name: Dragon In Ancient Egypt
X = 100
# End Output

So my question is, why do some names appear twice?

#2 Posted by mabster (22 posts)

I think your code is simply iterating through each "name" element in the XML, regardless of whether it's the character's name or the name of something else within the data.

So you're outputting lines like this:
name: Jason
name: Keyop
name: Mark

... which are character names from "G Force to the Rescue!", but your code is also finding the "name" of that title (maybe it's the first appearance of one of the characters or something), so your code writes:

name: G Force to the Rescue!

I don't know Ruby, but presumably there's a way to say "just get me the <name> elements that are direct children of the <character> tag, rather than recursively searching down through the XML and finding all of them."
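
As a rough sketch of that idea, the snippet below contrasts a recursive search with a direct-child lookup. It uses REXML from Ruby's standard distribution rather than Hpricot, which is a substitution on my part. The sample XML is hypothetical: the `first_appeared_in_issue` wrapper and the overall layout are assumptions about what the API returns, not something confirmed in this thread.

```ruby
require 'rexml/document'

# Hypothetical sample; the real API response layout is assumed, not verified.
SAMPLE_XML = <<~XML
  <response>
    <results>
      <character>
        <first_appeared_in_issue>
          <name><![CDATA[G Force to the Rescue!]]></name>
        </first_appeared_in_issue>
        <name><![CDATA[Tiny]]></name>
      </character>
      <character>
        <name><![CDATA[Zoltar]]></name>
      </character>
    </results>
  </response>
XML

doc = REXML::Document.new(SAMPLE_XML)

# Recursive search: every <name> at any depth, so issue titles are included.
all_names = REXML::XPath.match(doc, '//name').map(&:text)

# Direct-child lookup: only the <name> sitting immediately under each <character>.
direct_names = doc.elements.to_a('response/results/character').map do |character|
  character.elements['name'].text
end

puts "recursive:    #{all_names.inspect}"
puts "direct child: #{direct_names.inspect}"
```

The recursive search picks up the issue title as well as the two character names, while the direct-child lookup returns only the character names. Note that `text` already returns the CDATA content, so no string slicing is needed here.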

#3 Posted by aslambilal (11 posts)

So I ended up trying a different XML parsing library, and this gave me better results. I'm gonna tinker with the old one and see if I can get the same output later. Thanks for your reply, Mabster.


# Begin Code
require 'rexml/document'

def extractcdata( input )
  input.to_s[9..(input.to_s.size-4)]
end

xml = File.read( 'characters.xml' )

x = 0

doc, characters = REXML::Document.new( xml ), []
doc.elements.each( 'response/results/character/name' ) do |c|
  x = x + 1
  characters << c.text  # REXML elements have no innerHTML; text already returns the CDATA content
end

puts characters
puts x
# End Code

#4 Posted by mabster (22 posts)

If you're learning Ruby (or any programming language) I can definitely recommend Stack Overflow.

#5 Posted by aslambilal (11 posts)

Thanks :)


I've been using Google and searching for "ruby tutorial", "rails 2.x tutorial", or combinations. Not being able to find a job these days leaves me with too much spare time. I may start studying for the LSAT soon as well if nothing happens.

#6 Posted by aslambilal (11 posts)

Code Update:


#BEGIN

require 'rexml/document'
require 'net/http'
require 'open-uri'

def extractcdata( input )
  input.to_s[7..(input.to_s.size-3)]
end

url = "http://api.comicvine.com/characters/?api_key=(API)&sort=name&format=xml"

xml = URI.open( url )  # on Ruby 3.0+, plain open() no longer fetches URLs

# xml = File.read( 'characters.xml' )

x = 0

regex = Regexp.new(  /\[.*\]/ )

doc, characters = REXML::Document.new( xml ), []
doc.elements.each( 'response/results/character/name' ) do |name|
  x = x + 1
  matchname = regex.match( name.to_s )
  if matchname
    characters << extractcdata( matchname[0] )
  else
    puts "NO MATCH: " + name.to_s  # name is a REXML::Element, so convert it before concatenating
  end
end

puts characters
puts x

#END

Comment - Output is not sorted by name of character. Any idea why?

#OUTPUT
Lightning Lad
Dream Girl
Brainiac 5
Invisible Kid
Phantom Girl
Sun Boy
Star Boy
Shadow Lass
Triplicate Girl
Element Lad
Cosmic Boy
Micro Lad
...
#END
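
Whatever the reason the `sort=name` parameter has no visible effect (nothing in this thread explains it), one possible workaround is to sort the collected names on the client side. A minimal sketch, assuming `characters` is the array of name strings built by the loop above; the sample values are the twelve names from the output listing:

```ruby
# Sample data copied from the output above; in the real script this would be
# the characters array filled in by the REXML loop.
characters = [
  "Lightning Lad", "Dream Girl", "Brainiac 5", "Invisible Kid",
  "Phantom Girl", "Sun Boy", "Star Boy", "Shadow Lass",
  "Triplicate Girl", "Element Lad", "Cosmic Boy", "Micro Lad"
]

# Sort case-insensitively so "apple" and "Apple" would land next to each other.
sorted_characters = characters.sort_by(&:downcase)

puts sorted_characters
```

This only orders the names already fetched in one response. If the API pages its results, sorting locally does not give a globally sorted list.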