Grid computing

A specialist site about grid computing, Globus, distributed systems, and new areas of computer science

Home supercomputer

The OpenMosix software turns a network of GNU/Linux computers into a cluster. Load balancing between nodes is performed automatically. Nodes can join the cluster or leave it without interrupting the system. Address: http://openmosix.sourceforge.net
Posted on Saturday, 19 Aban 1386, at 15:56 by یوسف عبدلیان باریکرسفی

In search of the heterogeneous parallel virtual machine

The price of hardware and of "legacy" systems is falling fast. These days you can buy servers with four Pentium III processors on Ebay.com for under eight hundred dollars. Silicon Graphics Indigo and Indy machines are being sold at unbelievably low prices and tempt hackers to buy a few of these legendary machines. After all, didn't the most important events in media and information technology of the previous decade take place on these very machines? The bitter truth, however, is that until a coherent application programming interface (API) exists to connect these machines into a cluster, they will remain nothing more than legends and will solve nobody's problems. The technology known as the Heterogeneous Parallel Virtual Machine (HPVM) was designed and introduced to solve this problem. HPVM tries to take the concept of clustering out of the grip of universities and research centers and put it in everyone's hands.
Posted on Saturday, 19 Aban 1386, at 15:55 by یوسف عبدلیان باریکرسفی

IBM: clustering software on a CD

In one of the strangest articles I have seen in my life, Mayank Sharma of IBM proposes a way to set up a clustering system that uses neither the multi-thousand-dollar software from HP, Sun, and IBM nor Superdome, UltraSPARC, and POWER servers. Mayank's recipe consists of a Knoppix CD (available from your local supermarket in Tehran) and ordinary PCs. The article strangely reminds me of the cold fusion affair. Is it really that simple?

Craft a load-balancing cluster with ClusterKnoppix

Posted on Saturday, 19 Aban 1386, at 15:54 by یوسف عبدلیان باریکرسفی

Windows will support clustering too

At last, after years of waiting (to be exact, ever since Windows NT 4 came to market and was supposed to rub Solaris's nose in the dirt (!)), Microsoft will release a version of Windows with real clustering capability this fall. The News.com site reported the story under this headline: "Windows version for supercomputers arrives this fall". This headline, which is truly laughable, shows that either the News.com reporter did not understand clustering (which means turning ordinary computers into a kind of supercomputer, not an operating system for supercomputers) or this is a new marketing method from Microsoft. (My God, from tomorrow customers will be asking us for a Windows supercomputer that also has .NET.)
Guess what these "supercomputer" plans (what a pompous name; the marketers at Sun, HP, Oracle, Red Hat, and Apple should learn from Microsoft) currently revolve around? Naturally, the important, highly technical, and scholarly matter of the money Microsoft wants to charge customers for each cluster node.
Read the News.com story for yourself and have a laugh.
Posted on Saturday, 19 Aban 1386, at 15:53 by یوسف عبدلیان باریکرسفی

Enable existing applications for grid

Introduction

Before using the techniques described in this article, make sure you are familiar with the six strategies for grid application enablement described in the following "Six strategies for grid application enablement" series of articles:

  • Part 1 provides a series overview of the six strategies, and summarizes the characteristics and benefits of each strategy.
  • Part 2: Strategy 1 Batch Anywhere and Strategy 2 Independent Concurrent Batch shows how to enable applications for the grid using these two strategies. In the first strategy, the application can run as a single job on any of many computers in a grid. In the second strategy, multiple independent instances of the application can be running concurrently.
  • Part 3: Strategy 3 Parallel Batch and Strategy 4 Service discusses grid enablement using these two mutually exclusive strategies. In Strategy 3, a batch job is subdivided. Its many independent subjobs run concurrently on behalf of the user who submitted the aggregate job. Strategy 4 discusses implementations of a service-oriented architecture in a grid environment.
  • Part 4: Strategy 5 Parallel Service shows how to make many instances of a service available at once and available to be exploited in parallel by each client.

In this article, we introduce the basic architectural pattern that fits most cases when you enable existing code. Next, we discuss a few strategies to enable existing code based on the most common architectures we have encountered over the years. We introduce two basic scenarios for platform-specific distributed applications, and we will study the most common architectural variations of each scenario and the most common way to make these architectures fit into the enablement pattern. We conclude with two scenarios for Web-enabled applications (servlet-centric and database-centric).

There's a pattern

Most enablement efforts for existing code using batch-oriented grid infrastructure software are similar. There's a pattern you can follow. You can use this pattern to achieve the first three strategies of grid adoption.

The base case for these three strategies for grid adoption is a program that takes command-line parameters and uses files or databases as specified in the command-line parameters. If the program is licensed, the grid infrastructure will require license management capabilities.
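
As a rough illustration of this base case, the following Python sketch shows a program whose inputs and outputs are described entirely by its command line. The flag names (`--input`, `--output`) and the line-counting "work" are assumptions invented for the example, not part of any particular product.

```python
import argparse
from pathlib import Path


def build_parser():
    # Everything the job needs is named on the command line, so the grid
    # infrastructure can start it on any node that can reach the files.
    parser = argparse.ArgumentParser(description="Hypothetical batch job")
    parser.add_argument("--input", required=True, help="file to read")
    parser.add_argument("--output", required=True, help="file to write")
    return parser


def run(argv):
    args = build_parser().parse_args(argv)
    lines = Path(args.input).read_text().splitlines()
    # Stand-in for the real work: count the input lines.
    Path(args.output).write_text(f"{len(lines)}\n")
    return 0
```

A call such as `run(["--input", "in.txt", "--output", "out.txt"])` behaves the same whether a user or a grid scheduler supplies the arguments, which is what makes this kind of program easy to enable.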

In general, the first three strategies to enable an existing application to run on a grid all require the user to send requests to a client application, which acts as a requester to the grid infrastructure. The client to the client application can be the actual user or a portal. The grid infrastructure takes care of deploying the actual application, which becomes a provider to the grid infrastructure. See Figure 1.


Figure 1. Integration pattern for enabling existing code

The key point is that the client program (job submission driver) should talk to the grid infrastructure as if it is talking directly to the application. The simplest way to do this is by having the client program issue command-line type instructions to the virtualized application.
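
A minimal sketch of that idea in Python follows. The `gridsubmit` command is a hypothetical placeholder for whatever submission tool a given grid infrastructure provides; the point is only that the application's own command line is passed through unchanged.

```python
import shlex

# "gridsubmit" is a hypothetical submission command, not a real tool.
SUBMIT_PREFIX = ["gridsubmit", "--"]


def direct_command(app, params):
    """Command line a user would type to run the application locally."""
    return [app] + [f"--{key}={value}" for key, value in params.items()]


def grid_command(app, params):
    """The same command line, handed to the grid infrastructure as is."""
    return SUBMIT_PREFIX + direct_command(app, params)


cmd = grid_command("simulate", {"input": "run1.dat", "steps": 100})
print(shlex.join(cmd))
```

Because the driver only prepends the submission prefix, the application never has to know whether it was started by a person or by the grid.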

Implementing this scenario is easy when the application is a stand-alone program with minimal deployment requirements. But when you're dealing with integrated applications, the whole thing might require a little creativity.

Known scenarios

Enabling existing code using batch-oriented grid infrastructure software involves a finite number of known scenarios because of the state of the software industry today. Two scenarios exist, both involving essentially the same type of application:

  1. Enabling platform-specific distributed applications, which includes client/server, transactional, and batch-oriented applications written before Web applications existed.
  2. Enabling Web-enabled applications, which includes those platform-specific distributed applications that were given Web "front ends" or "wrappers" to make them work as Web applications. In most cases, the integration strategy involves enabling servlet-centric or database-centric applications.





Enabling platform-specific distributed applications

As previously mentioned, this scenario applies to client/server, transactional, and batch-oriented applications. Two architectural tendencies prevail in these three types of applications: monolithic and modular.

The case of monolithic applications

In general, deploying platform-specific monolithic applications on batch-oriented grid infrastructure software is as simple as installing the application on all grid nodes and writing some "glue code" that will integrate user requests, parameter passing, and program calls from the grid infrastructure. See Figure 2.


Figure 2. Deploying monolithic applications using the standard enablement pattern

This "glue code" we're talking about is what makes this an integration job. In most cases, you can integrate a monolithic application and a batch-oriented grid infrastructure product through scripting. You can use Perl, Python, or ordinary shell scripts to integrate user requests, parameter passing, and application calls within the context of the grid infrastructure software.
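
The glue can be as small as the following Python sketch. It assumes a request arrives as a dictionary and that the installed application is an ordinary executable; a one-line Python program stands in for that executable here so the example can run anywhere.

```python
import subprocess
import sys

# Stand-in for the installed application: a one-line Python program that
# prints the sum of its numeric arguments. A real deployment would name
# the application's own executable here instead.
APP = [sys.executable, "-c", "import sys; print(sum(map(int, sys.argv[1:])))"]


def request_to_args(request):
    """Parameter passing: map a user request onto command-line arguments."""
    return [str(value) for value in request["values"]]


def run_job(request):
    """Program call: run the application and hand back what it printed."""
    result = subprocess.run(
        APP + request_to_args(request),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

The three concerns named above (user requests, parameter passing, and application calls) each get one small piece, and none of them requires changes to the application itself.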

There's a caveat

The caveat has to do with what a monolithic application does and how it does it. Monolithic applications tend to try to be all things to all people. That's one reason they're monolithic (modules are not trustworthy to some people). Sometimes, monolithic applications have, among other things, embedded grid functionality. Embedded items and the way in which such functionality was implemented will determine whether the application can run on a grid.

For instance, a monolithic application that does not have any built-in grid infrastructure functionality will be easier to enable than others. A harder case would be a similar application that, on top of doing what it is supposed to do, also takes care of tasks such as scheduling which instance processes what request, deciding which database tables need to be locked on behalf of a given user, or determining when transaction affinity needs to be enforced. In that case, we still need to look into how the built-in grid functionality was implemented.

If the built-in grid functionality can be turned off from within the application, then this monolithic piece of code will be able to run on top of batch-oriented grid infrastructure software with no problem. If, on the other hand, the grid functionality is built deep into the application and it cannot be turned off, then we have a problem.
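
The easy case can be pictured with a small Python sketch. The `Application` class and its `internal_scheduling` switch are hypothetical; they only illustrate what "can be turned off from within the application" might look like.

```python
class Application:
    """Hypothetical monolithic application with an optional built-in scheduler."""

    def __init__(self, internal_scheduling=True):
        self.internal_scheduling = internal_scheduling

    def handle(self, requests):
        if self.internal_scheduling:
            # Built-in behaviour: the application decides the processing order.
            requests = sorted(requests, key=lambda r: r["priority"])
        # With the switch off, ordering is left to whoever submitted the
        # work, for example the grid infrastructure's scheduler.
        return [self.process(r) for r in requests]

    def process(self, request):
        return request["name"].upper()
```

When no such switch exists and the ordering logic is woven through `process` itself, the enablement effort is of a different order, which is the hard case described above.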

In general, the extent of the work needed to turn off built-in grid infrastructure functionality is very high. When programmers were learning to reuse code, they also learned to abstract functionality both ways: up and down. Embedding grid infrastructure features in business logic frameworks is an example of abstracting functionality down.

We can't blame programmers for doing this. At the time when most platform-specific distributed applications were written, very few people were thinking about grid computing. At the time, nobody even considered the possibility of finding ready-made containers for their applications, much less complete grid infrastructures where they could just deploy their code and move on with their lives.

The case of modular applications

In general, enabling modular platform-specific distributed applications will be easier than enabling monolithic applications, because modular applications give you choices on how to deploy them. There are caveats just as in the previous case, but with modular applications there are easier ways to get around them.

Turning off modules

One of the main advantages modular applications have is the possibility of turning off modules when necessary. This way, any environment-related functionality can be handed over to the grid infrastructure.

As in the case of monolithic applications, the existence of built-in grid features and whether they can be turned off or stripped out will also determine the degree of difficulty for the grid enablement effort.

The difference is that most modular applications, when they have any built-in grid features, will most likely concentrate that functionality in a single module, or a group of specialized modules. This should make it easier (in theory) to turn those modules off or to just eliminate them altogether.

There's another caveat

This caveat is inter-module communication. The degree of difficulty in turning off, or stripping out, application modules depends on how the designers implemented inter-module communication. In general, the simpler the transport, the lower the degree of difficulty.

For instance, it is common for applications of this kind to have a dedicated module to handle all database calls. In some cases, the module not only acts as a universal database client by supporting ODBC or JDBC drivers for several vendors but also does something we can call "table access scheduling," which is sort of an intra-application table-locking mechanism that allows the application to handle table locking independent of the database.

Having a universal database client is a good idea. However, if the application is to be grid-enabled, it is better to leave table locking to the data grid infrastructure (let's assume that's what we're doing with the application). So, all we need to do is substitute the module for a regular database client, deploy the RDBMS into the data grid, and we've got ourselves a grid-enabled application.
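
A minimal sketch of such a "regular database client" is shown below. It uses Python's built-in `sqlite3` module purely as a stand-in for the RDBMS so the example is runnable; the class name and table are invented for illustration.

```python
import sqlite3


class PlainDatabaseClient:
    """Thin client: no application-level table locking of its own."""

    def __init__(self, connection):
        self.connection = connection

    def execute(self, sql, params=()):
        # Locking and isolation are left to the database engine; the
        # connection's context manager only commits or rolls back.
        with self.connection:
            return self.connection.execute(sql, params).fetchall()


# sqlite3 stands in for the RDBMS purely so the sketch is runnable.
db = PlainDatabaseClient(sqlite3.connect(":memory:"))
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
db.execute("INSERT INTO orders (item) VALUES (?)", ("widget",))
rows = db.execute("SELECT item FROM orders")
```

The rest of the application keeps calling `execute`, so swapping the old module for this one does not ripple through the code, provided the old module was reached through a comparably narrow interface.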

Most database clients and listeners rely on TCP/IP sockets to get their orders from a program. A DB2® listener, for example, uses port 50000 by default. But what if the application designers decided that the tried-and-true way of TCP/IP sockets was not fancy enough for their application? What if they decided to go with a proprietary mechanism for inter-module communication? Then the problem is not so straightforward anymore.

There is another aspect to inter-module communication that can turn out to be a show-stopper. If the application is to be deployed on a computational grid, there can be several instances of several modules running concurrently on the grid. If the grid infrastructure software cannot relay the transport mechanism, or if the transport mechanism itself cannot function on a grid environment, the application simply will not work as expected.

Then, the degree of difficulty for the grid enablement project will be directly proportional to the effort of replacing the inter-module communication mechanism.

Deployment strategies: best-case scenario

An ideal modular application should handle inter-module communication with a dispatcher -- or broker -- module. This plan would allow modules to be deployed anywhere on the grid because inter-module communication will always happen through the broker. See Figure 3.


Figure 3. Ideal deployment for modular applications

The best-case scenario has two very desirable behaviors. First, application modules should be atomic to allow independent deployment from one another. The dispatcher-broker module should take care of all inter-module communication and data exchange.

As for shared libraries, the grid infrastructure should be able to handle them if they're installed as part of a system-wide installation. If not, they can be included as part of the provisioning policy for all modules so that all nodes have a local copy.

Second, application modules should be granular enough to allow for multiple instances of the same module to run concurrently (at least) on the same machine. A module should take care of its own results aggregation. Application-level results aggregation can be handled by the dispatcher-broker module or through the database.
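
The dispatcher-broker idea can be sketched in a few lines of Python. The module names (`pricing`, `ordering`) and the in-process function calls are assumptions for the example; in a real deployment the broker would relay messages between modules running on different grid nodes.

```python
class Broker:
    """Dispatcher-broker: the only component that knows where modules live."""

    def __init__(self):
        self._modules = {}

    def register(self, name, handler):
        self._modules[name] = handler

    def send(self, target, message):
        # Modules never call each other directly; every exchange goes
        # through the broker, so a module can be redeployed freely.
        return self._modules[target](self, message)


def pricing(broker, order):
    return {"item": order["item"], "total": order["qty"] * 2.5}


def ordering(broker, order):
    priced = broker.send("pricing", order)
    return f"{priced['item']}: {priced['total']:.2f}"


broker = Broker()
broker.register("pricing", pricing)
broker.register("ordering", ordering)
receipt = broker.send("ordering", {"item": "widget", "qty": 4})
```

Note that `ordering` reaches `pricing` only by name through the broker, which is the property that lets each module be deployed independently.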

Deployment strategies: most-common scenario

Unfortunately, most modular applications are not that well behaved. In some cases, module encapsulation is not atomic enough to allow for true independent deployment. In other cases, the dispatcher-broker module doesn't take care of all inter-module communication and data exchange. Instead, some modules call on each other directly, which forces them to reside on the same machine.

Results aggregation also represents a problem, especially when the dispatcher-broker doesn't fully own the task of managing inter-module communication. Some modules might feed their results into other modules instead of just passing them back to the broker. Whatever the situation, the most common scenario is to deploy the entire application on all grid nodes as shown in Figure 4.


Figure 4. Most common scenario for deploying modular applications

To an extent, the most common scenario means that a modular application can be deployed as a monolithic application if worse comes to worst. It should work but the advantage of being modular will be lost because it won't be exploited by the grid infrastructure as in the ideal case.

In the same vein, think of a monolithic application as a single-module modular application and treat it as such when devising a strategy for managing results aggregation in the case of multiple concurrent instances.
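
One way to picture results aggregation across concurrent instances is the Python sketch below. Threads stand in for grid nodes, and summing numbers stands in for the application's real work; both are assumptions made so the example stays small and runnable.

```python
from concurrent.futures import ThreadPoolExecutor


def run_instance(chunk):
    """One instance of the application; here it just sums its share."""
    return sum(chunk)


def split(data, parts):
    """Divide the input into roughly equal, independent subjobs."""
    size = max(1, -(-len(data) // parts))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def run_parallel(data, parts=4):
    # Threads stand in for grid nodes in this local sketch.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(run_instance, split(data, parts)))
    # Application-level aggregation of the per-instance results.
    return sum(partials)
```

The aggregation step is the part that needs design attention: it works here because the partial results can be combined in any order, which is not true of every application.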

These two cases, monolithic and modular applications, represent the simplest scenario when it comes to grid enabling platform-specific distributed applications. The situation changes dramatically when we deal with Web-enabled applications, as you'll see in the next section.





Enabling Web-enabled applications

A Web-enabled application is not a true J2EE application. We call "Web-enabled" those applications that were written originally as platform-specific distributed applications but run as Web applications, thanks to a Web front end.

The most common architectures are known as servlet-centric and database-centric, and each poses its own challenges to grid enablement.

Servlet-centric applications

Servlet-centric applications, in general, follow the architectural pattern illustrated in Figure 5.


Figure 5. Most common servlet-centric architectural pattern

The platform-specific application in Figure 5 is, in some cases, patched up to support things such as XML and other technologies. It is enabled to talk to a Java™ Virtual Machine via JNI or a proprietary connector framework. As for database support, it is common to use ODBC-to-JDBC bridges or to just stay with ODBC.

When "ported" to run on J2EE application servers, servlet-centric applications interact with clients via a gateway servlet, which relays requests to the connector and thus to the actual application. A typical porting to an application server, such as IBM® WebSphere® Application Server, looks like the one illustrated in Figure 6.


Figure 6. Typical servlet-centric Web enablement strategy

Enabling an application that follows this execution pattern to run on a grid would require at least the following steps:

  1. Modify the deployment descriptor so that the only component being actually deployed on WebSphere Application Server is the gateway servlet, which will become the portal for the users.
  2. Take the Java piece of the connector framework (JNI or proprietary) that acts as an interface to the core application and deploy it as a stand-alone Java application. This will be the client program or job submission driver.
  3. Deploy the core of the application as a monolithic or modular application (whichever term applies) on a batch-oriented grid infrastructure.

The resulting scenario is illustrated in Figure 7.


Figure 7. Typical servlet-centric grid deployment

It might be necessary to change some of the original assumptions when implementing this scenario. For instance, a deployment like the one shown in Figure 7 would probably be easier to manage if the gateway servlet ran on WebSphere Application Server Express (which doesn't include an EJB container) or Apache Tomcat, as opposed to WebSphere Application Server Advanced Edition. A change of this nature would actually benefit your customers because it could lower the total cost of the application.

Keep in mind this is just one way of doing it. There may be better ways to architect the deployment pattern depending on the characteristics of the application. The best solution should provide the best returns in terms of feasibility, effort, usability, and administration.

Caveats

The issues affecting monolithic and modular applications can also affect servlet-centric grid enablements. In addition, issues related to application performance can also arise.

Servlet-centric applications, in most cases, will experience performance problems at the Web container level. The use of a gateway servlet can create a sometimes nasty bottleneck that stems from, among other reasons:

  • The servlet execution model, especially if the gateway servlet relies on JavaServer Pages (JSPs) for presentation logic
  • The latency created by connector frameworks such as JNI -- In some JVM implementations (ours, for example), JNI calls cause the JVM to create pointers to allocate the requested platform-specific processes. Sometimes these pointers occupy large chunks of memory, and the JVM needs to refresh them continuously to prevent the garbage collection thread from picking them up while they're still active (you don't want that to happen). This creates additional overhead on the JVM, which translates into higher CPU and memory heap utilization by the Java process in which the JVM is running.

These issues have nothing to do with the grid infrastructure. They can cause problems even if the application is not running on a grid. You need to be aware that you will have to tune the application server under the new conditions once the application is deployed on a grid. Other issues can stem from the interaction of the application server and the grid infrastructure. For instance:

  • The overhead for security between the application server and the grid infrastructure. Given that grids are security freaks, and given that JNI calls don't always like to be asked to authenticate, the result sometimes is that the application has to log on to the grid infrastructure every single time it makes a connector call. This can slow things down.
  • Network latency between the application server and the grid infrastructure, especially on geographically dispersed deployments.
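
To make the cost of the first issue concrete, the following Python sketch contrasts logging on for every call with reusing one logon until it expires. The `GridSession` class is hypothetical, and whether a given grid infrastructure permits credential reuse at all is an assumption that has to be checked against its security model.

```python
import time


class GridSession:
    """Reuses one logon for many connector calls (hypothetical API)."""

    def __init__(self, logon, lifetime_seconds=3600.0, clock=time.monotonic):
        self._logon = logon          # callable that performs the real logon
        self._lifetime = lifetime_seconds
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def token(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            # Pay the authentication cost only when the token is missing
            # or has expired, not on every call.
            self._token = self._logon()
            self._expires_at = now + self._lifetime
        return self._token
```

With a per-call logon, the number of authentications grows with the number of connector calls; with a session like this one, it grows only with elapsed time.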

Sometimes these problems, when they all surface at the same time, make the whole grid enablement exercise too complicated and too costly to be worth the effort. Sometimes it's better to consider the possibility of deploying the platform-specific piece of the application as a regular monolithic or modular application (whichever applies), or to rewrite the platform-specific piece as a J2EE-compliant set of components and look for an SOA-based grid infrastructure.

Database-centric applications

Database-centric applications became the de-facto standard back in the day of the client/server paradigm. Tools such as PowerBuilder, Oracle 2000, PacBase, Progress, and other products were widely used to create monolithic, footprint-heavy, and large applications that required proprietary languages and specialized skills to understand them.

Some products of this type managed to adapt to the new distributed paradigm and gave us the Web-enablement hybrids we now know as database-centric applications. Some vendors claim to have re-engineered their products to be truly distributed and Web-native, but for some products, under the covers, the old client/server, monolithic architecture remains untouched.

This situation is understandable given that most vendors even invented their own languages to describe highly complex frameworks aimed at facilitating what was called in those days "Rapid Application Prototyping and Development."

Regardless of the technical value of these products, vendors need to preserve this intellectual capital for one simple reason: A lot of money was invested in their development. Therefore, database-centric applications are going to be around for a long time.

Common implementations of database-centric applications revolve around proprietary frameworks. In most cases, these frameworks have been "patched" to support Java technology, XML, and JMS, among other now-popular industry standards. Figure 8 illustrates the most common flavor of this architecture.


Figure 8. Most common database-centric architectural pattern

In most cases, the application and the database are bound together in a single package, and there's no difference between business logic and database operational logic. Because of this, added support modules do not interact natively with the original application.

Porting scenarios for database-centric applications are usually done through the Web container, as shown in Figure 9.


Figure 9. Typical database-centric Web enablement scenario

The strategy is similar to the one used for servlet-centric applications. It involves writing a gateway servlet that accesses the proprietary framework through the add-on Java support modules as dependent classes, or through XML files.

Treat it as a monolithic application

The easiest way to grid-enable a database-centric application is to treat it as a monolithic application. In this case, you would need to implement the deployment pattern shown in Figure 9.

But there's an interesting characteristic about database-centric applications that might provide a more efficient way to deploy certain applications on a grid.

Virtualizing data

Database-centric applications use the database not just to store data but also to keep configuration information, workflow data, and even presentation metadata. This dependency on the database is sometimes so tight that it is impossible to separate the database from the runtime environment. In some cases, the application actually runs on top of the database engine.

In cases where the database run time is the application run time, what can be virtualized is not the business logic but the data. You need to use a data grid instead of a computational grid as in the previous scenarios.

By virtualizing a database-centric application on a data grid, you would be indirectly virtualizing the application run time and, thus, the application itself. All the data the application needs to run will be made available locally on all nodes on the grid and, whenever the application changes, the modifications will be propagated automatically.

What happens is that you get location independence for the requests going to the database while the data is propagated all across the grid. To the requesting program, the data seems to be local all the time when, in reality, it can be located anywhere on the grid. So, what runs as a single-instance batch job is the request broker (the Java interface and a driver for the thick client) while the data is virtualized by the grid infrastructure.
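
The location independence described here can be illustrated with a toy Python model. The `DataGrid` class, the node names, and the replicate-everywhere policy are simplifying assumptions; a real data grid would handle consistency, failures, and partial replication.

```python
class DataGrid:
    """Toy data grid: callers ask by key and never name a node."""

    def __init__(self, nodes):
        self._nodes = nodes  # node name -> dict of locally held data

    def put(self, key, value):
        # Propagate every change to all nodes, as the text describes.
        for store in self._nodes.values():
            store[key] = value

    def get(self, key):
        # The requester sees a single lookup; which node answers is hidden.
        for store in self._nodes.values():
            if key in store:
                return store[key]
        raise KeyError(key)
```

The request broker only ever calls `get` and `put`, so it can run as a single-instance batch job while the grid decides where the data physically lives.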

Depending on how the business logic is written, you might be able to virtualize the data and the business logic as shown in Figure 10.


Figure 10. Data virtualization scenario for database-centric applications

Note that you can virtualize data and business logic on a data grid only if the business logic runs in processes triggered by the database engine, which is the case of most proprietary frameworks.

If, on the other hand, the business logic can be triggered by an outside process, such as the job submission driver, you might be able to actually separate data from business logic in what would be a combination of a computational grid for the business logic and a data grid for the database. This approach, however, might not be a feasible solution in some cases because it could introduce too much complexity into the deployment model.

Caveats

Database-centric applications are usually heavy on CPU and memory usage. This demand becomes especially visible when the database run time also runs applications.

In some cases, a grid deployment exercise might not work as expected due to the overhead created by the data grid itself, plus the overhead caused by a resource-hungry database runtime environment. In other cases, when it is possible to separate the data and the application, the situation resembles the case of monolithic applications and the same issues will apply.





Conclusion

The information in this article should give you enough ammunition to start brainstorming about the best way to grid-enable your product. If you want to grid-enable platform-specific distributed applications (both monolithic and modular) and Web-enabled applications (both servlet-centric and database-centric), this article shows you what to think about.



 


Posted on Saturday, 19 Aban 1386, at 14:45 by یوسف عبدلیان باریکرسفی

An overview of grid computing

Grid components: A high-level perspective

In this section, we describe at a high level the primary components of a grid environment. Depending on the grid design and its expected use, some of these components may or may not be required, and in some cases they may be combined to form a hybrid component. However, understanding the roles of the components as we describe them here will help you understand the considerations when developing grid-enabled applications.

Portal/user interface

Just as a consumer sees the power grid as a receptacle in the wall, a grid user should not see all of the complexities of the computing grid. Although the user interface can come in many forms and be application-specific, for the purposes of our discussion, let's think of it as a portal. Most users today understand the concept of a Web portal, where their browser provides a single interface to access a wide variety of information sources. A grid portal provides the interface for a user to launch applications that will use the resources and services provided by the grid. From this perspective, the user sees the grid as a virtual computing resource just as the consumer of power sees the receptacle as an interface to a virtual generator.


Figure 1. Possible user view of a grid

The current Globus Toolkit does not provide any services or tools to generate a portal, but this can be accomplished with tools such as WebSphere® Portal and WebSphere Application Server.

Security

A major requirement for grid computing is security. At the base of any grid environment, there must be mechanisms to provide security, including authentication, authorization, data encryption, and so on. The Grid Security Infrastructure (GSI) component of the Globus Toolkit provides robust security mechanisms. The GSI includes an OpenSSL implementation. It also provides a single sign-on mechanism, so that once a user is authenticated, a proxy certificate is created and used when performing actions within the grid. When designing your grid environment, you may use the GSI sign-in to grant access to the portal, or you may have your own security for the portal. The portal will then be responsible for signing in to the grid, either using the user's credentials or using a generic set of credentials for all authorized users of the portal.


Figure 2. Security in a grid environment

Broker

Once authenticated, the user will be launching an application. Based on the application, and possibly on other parameters provided by the user, the next step is to identify the available and appropriate resources to use within the grid. This task could be carried out by a broker function. Although there is no broker implementation provided by Globus, there is an LDAP-based information service. This service is called the Grid Information Service (GIS), or more commonly the Monitoring and Discovery Service (MDS). This service provides information about the available resources within the grid and their status. A broker service could be developed that utilizes MDS.
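
A broker function of this kind could be sketched in Python as follows. The resource records and their field names (`cpus`, `load`, `available`) are invented for the example and are not the MDS schema; a real broker would obtain equivalent information by querying the information service.

```python
def select_resources(resources, min_cpus, needed):
    """Pick the least-loaded available resources that satisfy the request."""
    candidates = [
        r for r in resources
        if r["available"] and r["cpus"] >= min_cpus
    ]
    candidates.sort(key=lambda r: r["load"])
    return [r["name"] for r in candidates[:needed]]


# Records such as an information service might report; the field names
# are invented for this sketch and are not the MDS schema.
resources = [
    {"name": "node1", "cpus": 4, "load": 0.7, "available": True},
    {"name": "node2", "cpus": 8, "load": 0.2, "available": True},
    {"name": "node3", "cpus": 8, "load": 0.1, "available": False},
    {"name": "node4", "cpus": 2, "load": 0.0, "available": True},
]
chosen = select_resources(resources, min_cpus=4, needed=2)
```

Here the selection policy is "available, big enough, least loaded first"; other policies only require changing the filter and the sort key.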


Figure 3. Broker service

Scheduler

Once the resources have been identified, the next logical step is to schedule the individual jobs to run on them. If a set of stand-alone jobs is to be executed with no interdependencies, then a specialized scheduler may not be required. However, if you want to reserve a specific resource or ensure that different jobs within the application run concurrently (for instance, if they require inter-process communication), then a job scheduler should be used to coordinate the execution of the jobs. The Globus Toolkit does not include such a scheduler, but there are several schedulers available that have been tested with and can be used in a Globus grid environment. It should also be noted that there could be different levels of schedulers within a grid environment. For instance, a cluster could be represented as a single resource. The cluster may have its own scheduler to help manage the nodes it contains. A higher-level scheduler (sometimes called a meta scheduler) might be used to schedule work to be done on a cluster, while the cluster's scheduler would handle the actual scheduling of work on the cluster's individual nodes.
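
The two levels of scheduling can be sketched in Python as below. Both classes and the round-robin placement policy are assumptions chosen for brevity; they do not correspond to any particular scheduler product.

```python
from itertools import cycle


class ClusterScheduler:
    """Places jobs on a cluster's own nodes, round-robin."""

    def __init__(self, nodes):
        self._nodes = cycle(nodes)

    def schedule(self, job):
        return next(self._nodes), job


class MetaScheduler:
    """Sees each cluster as one resource and delegates node placement."""

    def __init__(self, clusters):
        self._clusters = clusters  # cluster name -> ClusterScheduler

    def schedule(self, job, cluster):
        node, _ = self._clusters[cluster].schedule(job)
        return cluster, node, job
```

The meta scheduler never looks inside a cluster: it names the cluster, and the cluster's own scheduler chooses the node.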


Figure 4. Scheduler

Data management

If any data -- including application modules -- must be moved or made accessible to the nodes where an application's jobs will execute, then there needs to be a secure and reliable method for moving files and data to various nodes within the grid. The Globus Toolkit contains a data management component that provides such services. This component, known as Global Access to Secondary Storage (GASS), includes facilities such as GridFTP. GridFTP is built on top of the standard FTP protocol, but adds functions and utilizes the GSI for user authentication and authorization. Therefore, once a user has an authenticated proxy certificate, they can use the GridFTP facility to move files without having to go through a login process to every node involved. This facility provides third-party file transfer so that one node can initiate a file transfer between two other nodes.
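
As a hedged illustration of a third-party transfer, the Python sketch below only builds the command line for the Globus Toolkit's `globus-url-copy` client with two `gsiftp://` URLs; it does not run it. The host names and paths are placeholders, and the sketch assumes the client is installed and a valid proxy certificate already exists.

```python
def third_party_copy_command(src_host, src_path, dst_host, dst_path):
    """Build a globus-url-copy command that moves a file between two
    remote GridFTP servers; the machine running it is the third party."""
    source = f"gsiftp://{src_host}{src_path}"
    destination = f"gsiftp://{dst_host}{dst_path}"
    return ["globus-url-copy", source, destination]


# Host names and paths are placeholders for this sketch.
command = third_party_copy_command(
    "nodeb.example.org", "/data/input.dat",
    "nodec.example.org", "/scratch/input.dat",
)
```

The resulting list can be passed to `subprocess.run` on a machine with the toolkit installed; the data then flows directly between the two servers rather than through the initiating node.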


Figure 5. Data management

Job and resource management

With all the other facilities we have just discussed in place, we now get to the core set of services that help perform actual work in a grid environment. The Grid Resource Allocation Manager (GRAM) provides the services to actually launch a job on a particular resource, check its status, and retrieve its results when it is complete.
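
The launch, check, and retrieve cycle can be pictured with the in-memory Python stand-in below. The `JobManager` class and its state names (`PENDING`, `DONE`) are invented for this sketch and are not GRAM's API or job states.

```python
class JobManager:
    """In-memory stand-in for a resource manager: submit, poll, fetch."""

    def __init__(self):
        self._jobs = {}

    def submit(self, func, *args):
        job_id = len(self._jobs) + 1
        self._jobs[job_id] = {"state": "PENDING", "call": (func, args)}
        return job_id

    def run_pending(self):
        # A real resource manager would run jobs on remote resources;
        # here they simply run in-process, one after another.
        for job in self._jobs.values():
            if job["state"] == "PENDING":
                func, args = job["call"]
                job["result"] = func(*args)
                job["state"] = "DONE"

    def status(self, job_id):
        return self._jobs[job_id]["state"]

    def result(self, job_id):
        if self.status(job_id) != "DONE":
            raise RuntimeError("job has not finished")
        return self._jobs[job_id]["result"]
```

A client submits a job, polls `status` until it reports completion, and only then calls `result`, mirroring the three services described above.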


Figure 6. GRAM

Other facilities

There are other facilities that may need to be included in your grid environment and considered when designing and implementing your application. For instance, inter-process communication and accounting/chargeback services are two common facilities that are often required.

Posted on Saturday, 19 Aban 1386, at 14:38 by یوسف عبدلیان باریکرسفی