blob: 00aa3e3c9aa5178649713f106e79155f680a8d6d [file] [log] [blame]
.. _tutorials.gettingstarted.usingdatastore:
Using the Datastore
===================
Storing data in a scalable web application can be tricky. A user could be
interacting with any of dozens of web servers at a given time, and the user's
next request could go to a different web server than the one that handled the
previous request. All web servers need to be interacting with data that is
also spread out across dozens of machines, possibly in different locations
around the world.
Thanks to Google App Engine, you don't have to worry about any of that.
App Engine's infrastructure takes care of all of the distribution, replication
and load balancing of data behind a simple API -- and you get a powerful
query engine and transactions as well.
The default datastore for an application is now the `High Replication datastore <http://code.google.com/appengine/docs/python/datastore/hr/>`_.
This datastore uses the `Paxos algorithm <http://labs.google.com/papers/paxos_made_live.html>`_
to replicate data across datacenters. The High Replication datastore is
extremely resilient in the face of catastrophic failure.
One of the consequences of this is that the consistency guarantee for the
datastore may differ from what you are familiar with. It also differs slightly
from the Master/Slave datastore, the other datastore option that App Engine
offers. In the example code comments, we highlight some ways this might affect
the design of your app. For more detailed information,
see `Using the High Replication Datastore <http://code.google.com/appengine/docs/python/datastore/hr/overview.html>`_
(HRD).
The datastore writes data in objects known as entities, and each entity has a
key that identifies the entity. Entities can belong to the same entity group,
which allows you to perform a single transaction with multiple entities.
Entity groups have a parent key that identifies the entire entity group.
In the High Replication Datastore, entity groups are also a unit of
consistency. Queries over multiple entity groups may return stale, `eventually consistent <http://en.wikipedia.org/wiki/Eventual_consistency>`_
results. Queries over a single entity group return up-to-date, strongly
consistent, results. Queries over a single entity group are called ancestor
queries. Ancestor queries use the parent key (instead of a specific entity's
key).
The code samples in this guide organize like entities into entity groups, and
use ancestor queries on those entity groups to return strongly consistent
results. In the example code comments, we highlight some ways this might affect
the design of your app. For more detailed information,
see `Using the High Replication Datastore <http://code.google.com/appengine/docs/python/datastore/hr/overview.html>`_.
A Complete Example Using the Datastore
--------------------------------------
Here is a new version of ``helloworld/helloworld.py`` that stores greetings
in the datastore. The rest of this page discusses the new pieces::
import cgi
import datetime
import urllib
import wsgiref.handlers
from google.appengine.ext import db
from google.appengine.api import users
import webapp2
class Greeting(db.Model):
"""Models an individual Guestbook entry with an author, content, and date."""
author = db.UserProperty()
content = db.StringProperty(multiline=True)
date = db.DateTimeProperty(auto_now_add=True)
def guestbook_key(guestbook_name=None):
"""Constructs a datastore key for a Guestbook entity with guestbook_name."""
return db.Key.from_path('Guestbook', guestbook_name or 'default_guestbook')
class MainPage(webapp2.RequestHandler):
def get(self):
self.response.out.write('<html><body>')
guestbook_name=self.request.get('guestbook_name')
# Ancestor Queries, as shown here, are strongly consistent with the High
# Replication datastore. Queries that span entity groups are eventually
# consistent. If we omitted the ancestor from this query there would be a
# slight chance that Greeting that had just been written would not show up
# in a query.
greetings = db.GqlQuery("SELECT * "
"FROM Greeting "
"WHERE ANCESTOR IS :1 "
"ORDER BY date DESC LIMIT 10",
guestbook_key(guestbook_name))
for greeting in greetings:
if greeting.author:
self.response.out.write(
'<b>%s</b> wrote:' % greeting.author.nickname())
else:
self.response.out.write('An anonymous person wrote:')
self.response.out.write('<blockquote>%s</blockquote>' %
cgi.escape(greeting.content))
self.response.out.write("""
<form action="/sign?%s" method="post">
<div><textarea name="content" rows="3" cols="60"></textarea></div>
<div><input type="submit" value="Sign Guestbook"></div>
</form>
<hr>
<form>Guestbook name: <input value="%s" name="guestbook_name">
<input type="submit" value="switch"></form>
</body>
</html>""" % (urllib.urlencode({'guestbook_name': guestbook_name}),
cgi.escape(guestbook_name)))
class Guestbook(webapp2.RequestHandler):
def post(self):
# We set the same parent key on the 'Greeting' to ensure each greeting is in
# the same entity group. Queries across the single entity group will be
# consistent. However, the write rate to a single entity group should
# be limited to ~1/second.
guestbook_name = self.request.get('guestbook_name')
greeting = Greeting(parent=guestbook_key(guestbook_name))
if users.get_current_user():
greeting.author = users.get_current_user()
greeting.content = self.request.get('content')
greeting.put()
self.redirect('/?' + urllib.urlencode({'guestbook_name': guestbook_name}))
application = webapp2.WSGIApplication([
('/', MainPage),
('/sign', Guestbook)
], debug=True)
def main():
application.RUN()
if __name__ == '__main__':
main()
Replace ``helloworld/helloworld.py`` with this, then reload
`http://localhost:8080/ <http://localhost:8080/>`_ in your browser. Post a
few messages to verify that messages get stored and displayed correctly.
Storing the Submitted Greetings
-------------------------------
App Engine includes a data modeling API for Python. It's similar to Django's
data modeling API, but uses App Engine's scalable datastore behind the scenes.
For the guestbook application, we want to store greetings posted by users.
Each greeting includes the author's name, the message content, and the date
and time the message was posted so we can display messages in chronological
order.
To use the data modeling API, import the ``google.appengine.ext.db`` module::
from google.appengine.ext import db
The following defines a data model for a greeting::
class Greeting(db.Model):
author = db.UserProperty()
content = db.StringProperty(multiline=True)
date = db.DateTimeProperty(auto_now_add=True)
This defines a ``Greeting`` model with three properties: ``author`` whose
value is a ``User`` object, content whose value is a string, and ``date`` whose
value is a ``datetime.datetime``.
Some property constructors take parameters to further configure their behavior.
Giving the ``db.StringProperty`` constructor the ``multiline=True`` parameter
says that values for this property can contain newline characters. Giving the
``db.DateTimeProperty`` constructor a ``auto_now_add=True`` parameter
configures the model to automatically give new objects a ``date`` of the time
the object is created, if the application doesn't otherwise provide a value.
For a complete list of property types and their options, see `the Datastore reference <http://code.google.com/appengine/docs/python/datastore/>`_.
Now that we have a data model for greetings, the application can use the model
to create new ``Greeting`` objects and put them into the datastore. The following
new version of the ``Guestbook`` handler creates new greetings and saves them
to the datastore::
class Guestbook(webapp2.RequestHandler):
def post(self):
guestbook_name = self.request.get('guestbook_name')
greeting = Greeting(parent=guestbook_key(guestbook_name))
if users.get_current_user():
greeting.author = users.get_current_user()
greeting.content = self.request.get('content')
greeting.put()
self.redirect('/?' + urllib.urlencode({'guestbook_name': guestbook_name}))
This new ``Guestbook`` handler creates a new ``Greeting`` object, then sets its
``author`` and ``content`` properties with the data posted by the user.
The parent has an entity kind "Guestbook". There is no need to create the
"Guestbook" entity before setting it to be the parent of another entity. In
this example, the parent is used as a placeholder for transaction and
consistency purposes. See `Entity Groups and Ancestor Paths <http://code.google.com/appengine/docs/python/datastore/entities.html#Entity_Groups_and_Ancestor_Paths>`_
for more information. Objects that share a common `ancestor <http://code.google.com/appengine/docs/python/datastore/queryclass.html#Query_ancestor>`_
belong to the same entity group. It does not set the date property, so date is
automatically set to "now," as we configured the model to do.
Finally, ``greeting.put()`` saves our new object to the datastore. If we had
acquired this object from a query, ``put()`` would have updated the existing
object. Since we created this object with the model constructor, ``put()`` adds
the new object to the datastore.
Because querying in the High Replication datastore is only strongly consistent
within entity groups, we assign all Greetings to the same entity group in this
example by setting the same parent for each Greeting. This means a user will
always see a Greeting immediately after it was written. However, the rate at
which you can write to the same entity group is limited to 1 write to the
entity group per second. When you design a real application you'll need to
keep this fact in mind. Note that by using services such as `Memcache <http://code.google.com/appengine/docs/python/memcache/>`_,
you can mitigate the chance that a user won't see fresh results when querying
across entity groups immediately after a write.
Retrieving the Stored Greetings With GQL
----------------------------------------
The App Engine datastore has a sophisticated query engine for data models.
Because the App Engine datastore is not a traditional relational database,
queries are not specified using SQL. Instead, you can prepare queries using a
SQL-like query language we call GQL. GQL provides access to the App Engine
datastore query engine's features using a familiar syntax.
The following new version of the ``MainPage`` handler queries the datastore
for greetings::
class MainPage(webapp2.RequestHandler):
def get(self):
self.response.out.write('<html><body>')
guestbook_name=self.request.get('guestbook_name')
greetings = db.GqlQuery("SELECT * "
"FROM Greeting "
"WHERE ANCESTOR IS :1 "
"ORDER BY date DESC LIMIT 10",
guestbook_key(guestbook_name))
for greeting in greetings:
if greeting.author:
self.response.out.write('<b>%s</b> wrote:' % greeting.author.nickname())
else:
self.response.out.write('An anonymous person wrote:')
self.response.out.write('<blockquote>%s</blockquote>' %
cgi.escape(greeting.content))
# Write the submission form and the footer of the page
self.response.out.write("""
<form action="/sign" method="post">
<div><textarea name="content" rows="3" cols="60"></textarea></div>
<div><input type="submit" value="Sign Guestbook"></div>
</form>
</body>
</html>""")
The query happens here::
greetings = db.GqlQuery("SELECT * "
"FROM Greeting "
"WHERE ANCESTOR IS :1 "
"ORDER BY date DESC LIMIT 10",
guestbook_key(guestbook_name))
Alternatively, you can call the ``gql(...)`` method on the ``Greeting`` class,
and omit the ``SELECT * FROM Greeting`` from the query::
greetings = Greeting.gql("WHERE ANCESTOR IS :1 ORDER BY date DESC LIMIT 10",
guestbook_key(guestbook_name))
As with SQL, keywords (such as ``SELECT``) are case insensitive. Names,
however, are case sensitive.
Because the query returns full data objects, it does not make sense to select
specific properties from the model. All GQL queries start with
``SELECT * FROM model`` (or are so implied by the model's ``gql(...)`` method)
so as to resemble their SQL equivalents.
A GQL query can have a ``WHERE`` clause that filters the result set by one or
more conditions based on property values. Unlike SQL, GQL queries may not
contain value constants: Instead, GQL uses parameter binding for all values
in queries. For example, to get only the greetings posted by the current user::
if users.get_current_user():
greetings = Greeting.gql(
"WHERE ANCESTOR IS :1 AND author = :2 ORDER BY date DESC",
guestbook_key(guestbook_name), users.get_current_user())
You can also use named parameters instead of positional parameters::
greetings = Greeting.gql("WHERE ANCESTOR = :ancestor AND author = :author ORDER BY date DESC",
ancestor=guestbook_key(guestbook_name), author=users.get_current_user())
In addition to GQL, the datastore API provides another mechanism for building
query objects using methods. The query above could also be prepared as follows::
greetings = Greeting.all()
greetings.ancestor(guestbook_key(guestbook_name))
greetings.filter("author =", users.get_current_user())
greetings.order("-date")
For a complete description of GQL and the query APIs, see the `Datastore reference <http://code.google.com/appengine/docs/python/datastore/>`_.
Clearing the Development Server Datastore
-----------------------------------------
The development web server uses a local version of the datastore for testing
your application, using temporary files. The data persists as long as the
temporary files exist, and the web server does not reset these files unless
you ask it to do so.
If you want the development server to erase its datastore prior to starting up,
use the ``--clear_datastore`` option when starting the server:
.. code-block:: text
dev_appserver.py --clear_datastore helloworld/
Next...
-------
We now have a working guest book application that authenticates users using
Google accounts, lets them submit messages, and displays messages other users
have left. Because App Engine handles scaling automatically, we will not need
to revisit this code as our application gets popular.
This latest version mixes HTML content with the code for the ``MainPage``
handler. This will make it difficult to change the appearance of the application,
especially as our application gets bigger and more complex. Let's use
templates to manage the appearance, and introduce static files for a CSS
stylesheet.
Continue to :ref:`tutorials.gettingstarted.templates`.