I'm building a question classifier for a question answering system, and I want to know whether anyone has worked on something like this before. I've researched the state-of-the-art algorithms for developing such a component; most of them use machine learning with semantic features.

Here is a list of the papers I've read:

Question Classification using Support Vector Machines (Zhang, Lee)

Learning Question Classifiers (Li, Roth)

Question Classification with Log-Linear Models (Phil Blunsom)

asked 09 Dec '12, 13:01


Mokhtar Ashour


I don't understand what you're after. You say you want to know if anyone else has worked on question classifiers, and then list three papers about that topic, which clearly indicates that yes, other people have worked on this.

This site is intended for questions that have a clear focus and can be answered precisely. My impression is that your question is more of an invitation to discussion, and as such, a discussion group (such as the semantic web mailing list) might be more appropriate. See also http://answers.semanticweb.com/faq/#question-type-not

(09 Dec '12, 20:32) Jeen Broekstra ♦

@Jeen Broekstra I want to know whether anyone has tried these algorithms and gotten good precision with them.

(10 Dec '12, 10:20) Mokhtar Ashour

I'm inclined to believe that the results of those papers are substantially correct.

What all three have in common, and what is almost always missing from failed semantic learning projects, is that they classify queries against a specific ontology for which extensive training and evaluation data is available.

With this data available, it is straightforward to build classifiers and see how good the results are. Researchers, therefore, are a lot like the drunk who keeps looking for his keys under the streetlight because that is where the light is.

If your questions are sampled from the same prior distribution as the TREC questions (or you can pretend so) and you like that classification, it really makes sense to choose something that works from the above papers that you feel comfortable with in terms of what you can do practically. You can pick up the same data they use and go.
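To make "pick up the same data and go" concrete, here is a minimal sketch of a bag-of-words Naive Bayes classifier in plain Python. It is not the method of any of the three papers (they use SVMs, a hierarchy of linear classifiers, and log-linear models with richer features). The `COARSE:fine question` line layout and the toy examples are assumptions for illustration only; real training data would have thousands of lines.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy examples in an assumed "COARSE:fine question" layout.
TOY_LINES = [
    "HUM:ind Who wrote Hamlet ?",
    "HUM:ind Who discovered penicillin ?",
    "LOC:city Where is the Eiffel Tower ?",
    "LOC:country Where was Mozart born ?",
    "NUM:date When did World War II end ?",
    "NUM:date When was the telephone invented ?",
]

def parse(line):
    """Return (coarse_label, lowercase tokens) for one labelled line."""
    label, text = line.split(" ", 1)
    coarse = label.split(":")[0]
    return coarse, text.lower().split()

def train(lines):
    """Count classes and per-class word frequencies."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for line in lines:
        label, tokens = parse(line)
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def classify(model, question):
    """Pick the class with the highest log-probability (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    tokens = question.lower().split()
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TOY_LINES)
print(classify(model, "Who painted the Mona Lisa ?"))
```

With so few examples the model is really just keying on the question word, which is exactly why the amount and quality of labelled data matters more than the choice of learner.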

If you want to classify some other space of questions with some other categories, the thing you need to replicate from those papers is the methodology for creating training and evaluation data. That's a lot more fundamental than whichever machine learning algorithm you choose, or whether you choose to develop heuristic rules by hand instead.
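As a rough sketch of that methodology: label a sample of your own questions by hand, hold part of it out, and measure per-category precision on the held-out part. The same harness works for a learned model or for hand-written rules. The category names, the example questions, and the `rule_classifier` below are all hypothetical and only there to show the shape of the evaluation.

```python
import random
from collections import Counter, defaultdict

def stratified_split(examples, test_fraction=0.5, seed=0):
    """Split (question, label) pairs label by label, so that every label
    contributes at least one example to the held-out test part."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for question, label in examples:
        by_label[label].append((question, label))
    train, test = [], []
    for items in by_label.values():
        rng.shuffle(items)
        cut = max(1, int(len(items) * test_fraction))
        test.extend(items[:cut])
        train.extend(items[cut:])
    return train, test

def precision_per_label(classify, test):
    """For each predicted label: correct predictions / all predictions."""
    predicted, correct = Counter(), Counter()
    for question, gold in test:
        guess = classify(question)
        predicted[guess] += 1
        if guess == gold:
            correct[guess] += 1
    return {label: correct[label] / predicted[label] for label in predicted}

def rule_classifier(question):
    """Hypothetical hand-written heuristic: look only at the first word."""
    words = question.lower().split()
    first = words[0] if words else ""
    return {"who": "PERSON", "where": "PLACE", "when": "DATE"}.get(first, "OTHER")

# Hypothetical hand-labelled sample; a real one needs far more questions.
EXAMPLES = [
    ("Who founded the company ?", "PERSON"),
    ("Who is the current maintainer ?", "PERSON"),
    ("Where is the head office ?", "PLACE"),
    ("Where is the nearest airport ?", "PLACE"),
    ("When was version 2 released ?", "DATE"),
    ("When is the next release ?", "DATE"),
    ("When should I use a rule-based classifier ?", "OTHER"),  # "when", but not a date
    ("How do I install it ?", "OTHER"),
]

train_part, test_part = stratified_split(EXAMPLES)
print(precision_per_label(rule_classifier, test_part))
```

The `train_part` is what you would fit a model on (or read while writing rules); the numbers that count are the ones measured on `test_part`, which the classifier never saw during development.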


answered 10 Dec '12, 16:24


database_animal ♦

edited 10 Dec '12, 16:24

@database_animal So I can conclude that the algorithm will fit well, but I should carefully choose the data I use to train and test the system, right? Have you experienced that yourself in any way? Thanks in advance.

(10 Dec '12, 17:27) Mokhtar Ashour